FOREWORD


The host for the Thirty-Ninth Conference on the Design of
Experiments in Army Research, Development and Testing was the
Department of Statistics at Rice University. Professor James R.
Thompson, Department of Mathematics, invited this conference to be
held at Rice University. He was asked to be Chairperson of this
conference, which was held on 20-22 October 1993. Dr.
Thompson was assisted in this task by Mrs. Diane J. Brown,
Department Coordinator. These individuals are to be commended for
their efforts in coordinating all the details required to conduct
this large, successful scientific meeting.


Members  of  the  problem  committee  were  pleased  to  obtain  the 
services  of  the  following  distinguished  scientists  to  speak  on 
topics  of  interest  to  Army  personnel: 


Speaker and Affiliation             Title of Address

Professor Dennis Cox                Estimating Parameters in Complex
Rice University                     Computer Codes: Designing the
                                    Computer Experiments

Professor Katherine B. Ensor        Properties of Simulation Based
Rice University                     Estimators of Stochastic
                                    Processes

Professor Wei-Yin Loh               Tree-Structured Statistical Methods
University of Wisconsin-Madison

Professor Emanuel Parzen            Beyond Classical Statistical
Texas A&M University                Methods: Why and How
                                    (Gave the Keynote Address)

Professor J. Sethuraman             Contamination of Failure Data can
Florida State University            Change Nature of Failure Rate and
                                    Explain the Strength of Long Life
                                    Units

Professor Nozer Singpurwalla        On the Reliability of Emergency
and Jiangxian Chen                  Diesel Generators at U.S. Nuclear
George Washington University        Plants




This conference was preceded by a two day tutorial entitled
“Multivariate Density Estimation and Visual Clustering” presented
by Professor David W. Scott of Rice University. The purpose of
these tutorials is to develop, in Army scientists, an interest in
and an appreciation for the statistical methods that are needed to
analyze experimental data.

Dr. Douglas B. Tang, Chief of the Department of Biostatistics at
the Walter Reed Army Institute of Research, was selected to receive
the Twelfth U.S. Army Wilks Award for contributions to statistical
methodologies in Army Research, Development and Testing. Based on
his diverse research productivity, he has become widely recognized
as an authority on clinical trials, medical decision making,
bioassay, and laboratory data analysis.

The Program Committee has requested that the proceedings of the
1993 conference be distributed Army-wide so that the information
contained therein can assist scientists with some of their
statistical problems. Finally, committee members would like to
thank the Program Committee for all the work it did in putting
together this scientific meeting.


Program Committee

Gerald Andersen (ARO)               Carl Bates (CAA)
Kevin Beam (RAND)                   Barry Bodt (ARL)
Robert Burge (WRAIR)                Eugene Dutoit (AIS)
Jock Grynovicki (ARL)               Carl Russell (TEXCOM)
Douglas Tang (WRAIR)                Malcolm Taylor (ARL)
Deloris Testerman (TEXCOM)          Jim Thompson (RICE U.)
Henry Tingey (U. of DE)             David Cruess (USUHS)
Francis Dressel (ARO)               Jerry Thomas (ARL)
THIRTY-NINTH CONFERENCE ON THE DESIGN OF

EXPERIMENTS 

IN  ARMY  RESEARCH,  DEVELOPMENT,  AND  TESTING 

18-22  October  1993 


Host:  Department  of  Statistics 
Rice  University 
6100  South  Main  St. 
Houston,  Texas 

Location:  Kyle  Morrow  Room,  Fondren  Library 


0800 - 0915 REGISTRATION (Kyle Morrow Room Lobby)

0915  -  0930  CALL  TO  ORDER:  Jim  Thompson,  Rice  University 

OPENING  REMARKS:  (Michael  M.  Carroll,  Dean  of  Engineering, 

Rice  University) 

0930  -  1200  GENERAL  SESSION  I 

Chairperson:  Malcolm  Taylor,  Army  Research  Laboratory 

0930-1030  KEYNOTE  ADDRESS:  BEYOND  AOV  STATISTICAL 
METHODS 

Emanuel  Parzen,  Texas  A&M  University 
1030-1100  Break 

1100-1200 PROPERTIES OF SIMULATION BASED ESTIMATORS OF STOCHASTIC
PROCESSES 

Katherine  B.  Ensor,  Rice  University 

1200 - 1330 Lunch

1330  -  1500  CONTRIBUTED  SESSION  I 

Chairperson:  Linda  Moss,  Army  Research  Laboratory 

PARTIALLY  DUPLICATED  FACTORIAL  DESIGNS 
Peter  W.  M.  John,  University  of  Texas  at  Austin 


AN  APPLICATION  OF  GENERALIZED  P-VALUES  IN 

TANK  GUN  ACCURACY  RESEARCH 

David  W.  Webb,  Army  Research  Laboratory 


A  SERIES  OF  NEW  SUPERSATURATED  DESIGNS 

Margaret  G.  Ehm,  Marc  N.  Elliott,  and  Monnie  McGee,  Rice 

University 

1500 - 1530 Break

1530 - 1700 CONTRIBUTED SESSION II

Chairperson:  Carl  Russell,  TEXCOM 

JUDGING STATISTICAL SIGNIFICANCE GRAPHICAL METHODS VS
TRADITIONAL PARAMETRIC METHODS
Jock O. Grynovicki, Army Research Laboratory

AN  EMPIRICAL  STUDY  OF  THE  DISTRIBUTION  AND  PROPERTIES  OF 
THE  SLOPE  ESTIMATOR  USING  THE  MINIMUM  NORMED  DISTANCE 
CRITERION 

Barbara Wainwright, Salisbury State University and Henry B.
Tingey, University of Delaware

CHARACTERIZATION  RESULTS  IN  PROBABILITY 
Jerry  Andersen,  Army  Research  Office 

DETERMINATION  OF  DESIRED  DESIGN  AND  OPERATIONAL 
CHARACTERISTICS  OF  THE  SMALL  AREA  CAMOUFLAGE  COVER 
(SACC)  BY  GROUND  TROOPS 

George Anitole and Ronald L. Johnson, Belvoir Research,
Development and Engineering Center & Christopher J. Neubert,
Army Materiel Command

1830 - WILKS AWARD BANQUET (Cohen House/Faculty Club, Rice
University) 

1830-1930  Cash  Bar 
1930-  Dinner 


0800  -  0900  GENERAL  SESSION  II 


Chairperson:  Deloris  Testerman,  TEXCOM 

TREE-STRUCTURED  STATISTICAL  METHODS 
Wei-Yin Loh, University of Wisconsin-Madison

0900-0915  Break 
0915 - 1100 CLINICAL SESSION


Chairperson: W. J. Conover, Texas Technological University
Panelists: Bernard Harris, University of Wisconsin-Madison
Wei-Yin Loh, University of Wisconsin-Madison
J. Sethuraman, Florida State University
Nozer Singpurwalla, George Washington University

MOBILITY  FACTOR  INFERENCE 

C. Denise Bullock and Nancy Renfroe, Waterways Experiment
Station

COMBINING  SIMULATION  RESULTS  ADDRESSING  ARMOR  VEHICLE 

RESEARCH,  DEVELOPMENT  AND  TESTING 

Paul  J.  Deason,  TRADOC  Analysis  Center-WSMR 

P-VULTE  (P-VALUE  UPPER  &  LOWER  TEST  ESTIMATION) 

Paul H. Thrasher, Material Test Directorate-WSMR

1100-1115  Break 

1115-1215  CONTRIBUTED  SESSION  III 

Chairperson:  Doug  Tang,  Walter  Reed  Army  Institute  of  Research 

AUTOMATIC  CLASSIFICATION  OF  DOCUMENTS  BY  LEXICAL  CONTENT 
Mel  Brown,  Army  Research  Office 

AN APPLICATION OF CLASSIFICATION WITH POTENTIAL USE IN
REPRODUCTIVE  TOXICOLOGY 

Barry  A.  Bodt,  Army  Research  Laboratory  &  Ronald  J.  Young, 
Edgewood  Research,  Development  and  Engineering  Center 

1215  -  1330  Lunch 

1330 - 1500 CONTRIBUTED SESSION III (CONTINUED)

IMPROVED  PERIODOGRAM  ESTIMATORS  FOR  THE  COSINOR  MODEL 
R. John Weaver and Marshall Brunden, The Upjohn Company &
Jonathon Raz, University of Michigan

CONFIDENCE  INTERVALS  AND  TESTS  OF  HYPOTHESES  FOR  NORMAL 

COEFFICIENTS  OF  VARIATION 

Mark  G.  Vangel,  Army  Research  Laboratory 

ANALYSIS OF GAS FLOW RESISTANCE MEASUREMENT THROUGH
PACKED  BEDS. 

Malcolm  S.  Taylor  &  Csaba  K.  Soltani,  U.  S.  Army  Research 
Laboratory 

1500  -  1530  Break  (POSTER  SESSION,  Kyle  Morrow  Room  Lobby) 

DESKTOP  MODELS  FOR  WEAPONS  ANALYSES 
Eugene Dutoit and John D'Errico, Infantry School


1530  -  1630  GENERAL  SESSION  III 


Chairperson:  Jerry  Thomas,  Army  Research  Laboratory 

ON  THE  RELIABILITY  OF  EMERGENCY  DIESEL  GENERATORS  AT  U.  S. 
NUCLEAR  POWER  PLANTS 

Nozer Singpurwalla & Jiangxian Chen, George Washington
University 


Friday, 22 October 1993

0800  -  0900  GENERAL  SESSION  IV 

Chairperson: Bob Burge, Walter Reed Army Institute of Research

CONTAMINATION  OF  FAILURE  DATA  CAN  CHANGE  NATURE  OF 
FAILURE RATE AND EXPLAIN THE STRENGTH OF LONG LIFE UNITS
J.  Sethuraman,  Florida  State  University 

0900-0915  Break 

0915 - 1015 CONTRIBUTED SESSION IV

Chairperson: LTC Ronald Scotka, TEXCOM

IDENTIFYING  THE  CRITICAL  FACTORS  IN  AN  ADAPTIVE  NETWORK 
Ann  E.  M.  Brodeen,  Barbara  Broome,  George 
Hartwig,  and  Maria  Lopez,  Army  Research  Laboratory 

ESTIMATES  OF  THE  NUMBER  OF  MONTE  CARLO  TRIALS  NECESSARY 
FOR  MOBILITY  SENSITIVITY  ANALYSES 
Andrew  Harrell,  Waterways  Experiment  Station 

1030-1200  GENERAL  SESSION  V 

Chairperson: Barry A. Bodt, Army Research
Laboratory,  and  Chairman  of  the  AMSC 
Subcommittee  on  Probability  and  Statistics 

OPEN  MEETING  OF  THE  PROBABILITY  AND  STATISTICS 
SUBCOMMITTEE  OF  THE  ARMY  MATHEMATICS  STEERING  COMMITTEE 

ESTIMATING  PARAMETERS  IN  COMPLEX  COMPUTER  CODES: 
DESIGNING  THE  COMPUTER  EXPERIMENT 
Dennis  Cox,  Rice  University 

ADJOURN 


Program  Committee 


Gerald Andersen (ARO)               Carl Bates (CAA)
Kevin Beam (RAND)                   Barry Bodt (ARL)
Robert Burge (WRAIR)                Eugene Dutoit (AIS)
Jock Grynovicki (ARL)               Carl Russell (TEXCOM)
Douglas Tang (WRAIR)                Malcolm Taylor (ARL)
Deloris Testerman (TEXCOM)          Jim Thompson (RICE U.)
Henry Tingey (U. of DE)             David Cruess (USUHS)
Francis Dressel (ARO)               Jerry Thomas (ARL)


BEYOND  CLASSICAL  STATISTICAL  METHODS:  WHY  and  HOW 

by  Emanuel  Parzen 

Department  of  Statistics,  Texas  A&M  University 
College  Station,  TX  77843-3143 

ABSTRACT: This is a philosophical and technical paper about future directions
of statistical theory and practice. It discusses: 1. why and how components of statistical
reasoning, 2. certified professional statisticians, 3. statistical computing, 4. statistical
education, 5. defining the problem of statistics as probability modeling, 6. statistical education
analogues to statistical modeling, 7. function representations of data and mathematical
literacy, 8. the P value problems of statistics, 9. how to use correlation coefficients to develop
beyond statistical methods.

0.  INTRODUCTION 

This is a philosophical and technical paper about future directions of statistical theory
and practice. We propose that the concept of comparing and combining classical statistical
methods and modern data analysis methods should be called “Beyond Classical Statistical
Methods”. This name is inspired by Hirotsu (1993), “Beyond Analysis of Variance
Techniques: Some Applications in Clinical Trials”. Hirotsu reports that his new methods
(such as max chi-squared statistics and average chi-squared statistics) are being accepted in
Japanese statistical guidelines. One goal of this paper is to present a framework (in section
9) which shows how the statistics introduced by Hirotsu are related to other conventional
statistics.

While combining conventional and modern methods has a history of academic
development (Daniel (1959), Gnanadesikan (1980)), it may not be much practiced as yet because
applied statisticians have a tendency not to use methods which have not been made
readily available to them in statistical computing packages. This paper argues that unified
methods can impact applied research and statistical education.

The  technical  content  of  this  paper  is  the  final  section  which  outlines  our  research 
about  HOW  to  combine  non-parametric  quantile  and  Comparison  Change  Correlation 
techniques  with  classical  statistical  methods.  The  first  8  sections  discuss  from  various 
philosophical  viewpoints  WHY  this  research  should  be  on  the  agenda  of  statisticians  in 
a  society  whose  health  and  prosperity  is  increasingly  dependent  on  statistically  literate 
engineers,  scientists,  managers,  and  public. 

1.  WHY  AND  HOW  COMPONENTS  OF  STATISTICAL  REASONING 

I  believe  that  courses  and  talks  on  statistics  should  be  about  both  HOW  and  WHY. 

Academic researchers often minimize the WHY component, because a HOW talk often
emphasizes “get to the new material fast without worrying about motivating the results,
since to enhance your reputation you must impress fellow experts, in the short attention span that you
have available, that you've done something new and which works”. We say that the HOW
component of statistical reasoning is often “esoteric” in the sense that it is specialized
and technical in a way that appeals mainly to experts.

In contrast, the WHY component of statistical reasoning is intended to be “exoteric”
in the sense that it seeks to be understandable to a more general technical audience by
motivating WHY the methods are applicable and interpretable.


Presented as a Keynote Address on October 20, 1993 at the 39th Conference on the
Design of Experiments for Army Research, Development, and Testing at Rice University.
Research supported by the U. S. Army Research Office.
Statisticians need to be concerned with WHY in order to practice in their work the

Deming  inspired 

Continuous  Improvement  Principle 

which  states  that  “every  action  should  be  judged  by  how  well  it  positions  you  for  subsequent 
actions”.  Methods  should  be  called  simple  not  by  whether  their  theory  is  easy  but  whether 
their  interpretation  can  be  made  easy  to  comprehend. 

To  enhance  their  quality  (and  competitiveness)  many  organizations  are  adopting  a 

Continuous  Improvement  Process, 

defined  as  a  team  approach  to  Total  Quality  Management  to  improve  products  or  services 
to  exceed  the  expectations  of  customers  or  clients.  An  understanding  and  implementation 
of  statistical  concepts  of  change,  variation,  and  measurements  is  clearly  important  in  this 
process  which  requires  that  decisions  be  based  on  the  information  in  data,  not  just  on 
opinions  and  guesses. 

2.  CERTIFIED  PROFESSIONAL  STATISTICIANS 

A question of concern to a broad cross-section of applied statisticians is that
of professional certification of statisticians. Professors should be interested in this question
because I believe that it raises fundamental questions about

how  to  continuously  improve  statistics  courses. 

Several  ideas  that  I  believe  deserve  to  be  in  the  certification  discussion  are: 

(1) Is the best role model for professional certification of statisticians an exam structure
(similar to that of the Society of Actuaries) which is not a single exam but
a  series  of  exams?  In  this  way  one  can  encourage  and  reward  two  or  more  levels 
of  advanced  statistical  literacy.  Statistical  culture  is  understanding  that  there  are 
several  levels  of  professional  statistical  literacy,  involving  different  aspects  of  the 
practice  and  theory  of  statistics. 

(2) Should certification require, in addition to passing exams, a lifelong process of
Continuing Education credits? Do we not need to encourage and reward keeping
up with the latest developments through short courses and attendance at professional
meetings? I call this process

“studying  the  contemporary  history  of  statistics”. 

(3)  Certification  of  level  of  statistical  literacy  should  be  the  goal  of  exams  in  each 
statistics course. Statistical educators should seek consensus about the content
of  the  series  of  continually  updated  statistics  courses  that  would  provide  excellent 
education  in  applied  classical  and  modern  methods.  The  courses  should  have  both 
HOW  and  WHY  components. 

(4) Statistics programs should have courses that focus on problems of communication
and collaboration between statisticians and scientists (how to achieve a
collaboratory of statistical science).

3.  STATISTICAL  COMPUTING 

An  increasingly  urgent  question  is  the  role  in  statistics  education  of 

statistical  computing  and  statistical  packages, 
especially 

(1)  how  to  enable  new  methods  to  be  quickly  made  available  to  applied  researchers, 

(2)  how  to  enable  methods  which  are  complicated  (in  theory  and  computation)  to  be 
made  simple  (in  presentation). 

A  major  issue  of  integrating  Statistical  Computing  into  the  practice  of  statistics  is: 
solving  the  problem  that  new  methods  are  considered  purely  academic  unless  user  friendly 
software to use them is available.

A major issue of integrating Statistical Computing with statistical education is: how
to use statistical packages to implement “alternative” (self learning) classroom cultures
that stimulate students to develop statistical reasoning abilities by real life experience
which expects students to search for patterns, relate contrasting ideas, and give reasons
and arguments for the issues under discussion.

I believe that currently there is a danger in introductory courses that statistical
computing is taught only from the HOW point of view with no discussion of WHY issues,
especially

how  can  statistical  computing  make  a  statistical  method  simple, 
and  the  impact  of  computing  (especially  graphics)  on  how  statistics  is  practiced. 

4.  STATISTICAL  EDUCATION 

How to change the teaching of statistics is now being frequently discussed at statistics
meetings; Amstat News for October 1993 (p. 13) reports the revolutionary views of David
Moore that we need “new” teaching styles.

The  goals  of  an  “alternative”  educational  philosophy  should  be  to  emphasize  both 
practice  and  theory  using  two  teaching  strategies: 

1. “Never tell students what they can find out for themselves.”

2. “Tell students about those things which they will find most difficult to
learn by themselves.”

Other  goals  for  introductory  statistics  courses: 

1) public respect for statisticians,

2)  the  recruitment  of  statisticians, 

3)  public  statistical  literacy,  awareness  that  in  every  activity  one  should  strive  to 
compare  “expectation”  with  “reality”. 

I  recommend  that  courses  discuss: 

the “map” of statistics (its relations to other disciplines as the “glue” of science);
the “contemporary history of statistics” (emphasizing that innovations in methods
and applications are constantly occurring);

its  culture  (why  statisticians  are  oriented  to  “continuous  improvement”  and  how  they 
keep  up  with  new  “hammers”  (methods)  and  “nails”  (applications)). 

We  must  be  pro-active  in  changing  the  current  attitude  among  undergraduate  students 
that  statistics  is  a  required  and  irrelevant  course,  to  be  remembered  as  little  as  possible. 

5.  DEFINING  THE  PROBLEM  OF  STATISTICS  AS  PROBABILITY 
MODELING 

Defining what statistical science is about has always been regarded as a controversial
act  (many  statisticians  reject  the  hypothesis  that  one  can  be  certain  about  the  study  of 
uncertainty).  We  should  be  aware  of  the  various  definitions  of  statistics: 

1) help find scientific truth about probabilities and the fit of observations to theory;
2)  make  decisions  in  the  face  of  uncertainty  and  loss  functions; 

3)  model  uncertain  data  by  probability  models. 

I  believe  that  to  find  truth  (and  make  decisions)  one  must  explore  the  widest  range  of 
alternatives  (what  I  call  “going  to  the  edge”).  I  regard  as  most  operational  the  following 
definition: 

The most important concepts in statistics are the probability model and
likelihood; statistical thinking combines data analysis and concepts of
probability.

Introductory statistics courses may not need to include techniques for theoretically
computing probabilities but need to stress that probabilities are what statistics is
computing from data.

Guided by the proverb “if your only tool is a hammer, every problem looks like a nail”,
I proposed (Parzen (1993)) that the practice of statistics can be regarded as combining
“nails” (fields of applications) and “hammers” (general methods stated mathematically
which are combined to provide a custom made method for each application, not just
reducing each problem to fit the “simple” techniques the statistician knows).

Current important applications of statistics can be defined as analyzing change
(observing and measuring changes taking place in society, industrial processes, medical
treatments, the environment, economic indicators, etc.).

Current methods of statistics can be regarded as having a common theme: use
probability models to model and comprehend populations and data, by an iterative process
of model specification, parameter estimation, and model checking (eloquently
described by George Box and Gwilym Jenkins in the context of Time Series Analysis). That
applied statistics is best practiced by modeling is well described in an article in the
September 1993 American Scientist by Gauch.

Bayesians (of the dogmatic type who preach that priors are not just techniques but are
to be believed) imply that statisticians should never use non-Bayesian methods (one should
not analyze data for which one does not have prior beliefs about the model). Modeling
statisticians believe that data can yield patterns and models which provide insights which
were not thought of before the data analysis. The magazine “The Economist” (October
9, 1993 issue) states that principles of “data analysis without theory” are the basis of
current successful applied research on the mathematics of finance (investing).

6. STATISTICAL EDUCATION ANALOGUES TO STATISTICAL
MODELING 

Strategies for solving statistical problems are emphasized in the “new” teaching which
aims to give students a sense of purpose and direction to their statistical learning. My
major point is that reforms in statistical education and research are linked, because statistical
learning and statistical investigation are analogous, because both require a cycle of model
building, which one usually repeats (iterates) several times before reaching a satisfactory
conclusion.

The  SIET  cycle  of  statistical  model  building  consists  of  four  stages: 

Stage  1  (S):  Specify  very  general  class  of  models. 

Stage  2  (I):  Identify  tentative  parametric  model. 

Stage  3  (E):  Estimate  parameters  of  tentative  model. 

Stage  4  (T):  Test  goodness  of  fit,  diagnose  improved  models. 

(The  slogan  could  be:  “To  SIET  (see  it)  is  to  understand  it.”) 

The  cycle  of  statistical  problem  solving  consists  of  four  stages: 

Stage  1  (P):  Pose  the  question,  form  expectations. 

Stage  2  (C):  Collect  the  data,  make  observations. 

Stage  3  (A):  Analyze  the  data,  compare  observations  and  expectation. 

Stage 4 (I): Interpret the results, find the best theory or decision that fits the data.

The PCAI cycle of a statistical investigation should be represented in a diagram as a
circular process (rather than a linear process); see the figure on p. 183 of A. Graham (1993).
We prefer to call the cycle EOCI (Expect, Observe, Compare, Interpret).

Reformers  of  mathematics  education  take  the  view  that  teachers  should  communicate 
the  four  aspects  of  learning  which  cognitive  sciences  recommend  for  success: 

1.  simple  recall, 

2.  algorithmic  learning, 

3.  conceptual  learning,  and 

4. problem solving strategies.

In  statistical  teaching  we  can  make  these  cognitive  concepts  more  concrete  by  teaching 
that  statistical  concepts  (such  as  the  sample  mean  or  sample  variance)  have  three  aspects: 

1.  how  to  define  it  (mean  of  sample  distribution); 

2.  how  to  compute  it  (average  the  values  or  the  quantile  function); 

3. how to interpret it (estimate location parameter of sample).

The fourth aspect of statistical learning consists of ideas about combining concepts to
conduct  an  iterative  statistical  investigation  whose  output  is  data  models  which  can  be 
applied. 
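
The two ways of computing the sample mean mentioned above (averaging the values, or averaging the quantile function) can be checked numerically. The following is a minimal sketch, assuming the piecewise-constant sample quantile function Q(u) = x_(j) for (j-1)/n < u <= j/n; the function names and the data are illustrative and not taken from the paper.

```python
import math
from statistics import fmean

def sample_quantile(sorted_values, u):
    """Piecewise-constant sample quantile function Q(u), 0 < u <= 1.

    Assumed convention: Q(u) = x_(j) for (j - 1)/n < u <= j/n.
    """
    n = len(sorted_values)
    j = max(1, math.ceil(u * n))
    return sorted_values[j - 1]

def mean_from_quantile(values, grid=1000):
    """Midpoint-rule approximation of the integral of Q(u) over [0, 1].

    When grid is a multiple of the sample size, no midpoint falls on a
    jump of Q, so each order statistic receives equal weight.
    """
    xs = sorted(values)
    return sum(sample_quantile(xs, (i + 0.5) / grid) for i in range(grid)) / grid

data = [2.0, 3.5, 1.0, 7.5, 6.0]              # hypothetical sample
mean_direct = fmean(data)                      # average the values
mean_quantile = mean_from_quantile(data)       # average the quantile function
```

Both routes give the same number because integrating a step function with n equal-width steps is the same as averaging its n step heights.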

When  one  is  a  discussant  of  a  technical  paper  it  may  be  helpful  to  use  the  four  aspects 
of  learning  as  a  basis  for  evaluation. 

One  reason  the  definition 

“the methods of statistics are modeling”

may  be  controversial  among  statisticians  is  because  many  introductory  statistics  courses 
adopt  approaches  which  avoid  the  use  of  concepts  of  probability. 

Statistics  and  probability  need  to  be  linked  not  only  to  define  probability  models  but 
in  order  to  make  judgements  (and  simulations)  about  how  to  interpret  the  significance  of 
a  set  of  results,  to  explain  that  unusual  results  do  sometimes  occur  just  by  chance. 

Professors of education report that the dilemma of mathematics education reform
is that it requires teachers to have a deeper understanding of ideas and concepts, which
they are reluctant to study. Teachers prefer “ready to apply” modules rather than
professional development. How can we overcome these inhibitions to mathematics and statistics
educational reform?

7. FUNCTION REPRESENTATIONS OF DATA AND MATHEMATICAL
LITERACY

The  philosophy  of  “Beyond  Classical  Statistical  Methods”  proposes  that  to  practice 
statistics,  one  must  be  aware  of  the  relations  between  statistics  and  computing,  between 
statistics  and  probability,  and  between  statistics  and  mathematics. 

Early childhood study of statistical data analysis and probability is now regarded
as critical to developing mathematically literate students who can function in a society
driven by technology. Current mathematics educational reform movements believe
experience (with statistical data analysis) is the ideal way to teach and reinforce mathematical
concepts; I propose that statistics can benefit from mathematical tools (such as
representations of data by functions).

My Comparison Change Correlation statistical methods emphasize innovations in
functions that can be used to describe probability relationships and the “shape” of data.
These functions are defined on the unit interval (denoted [0,1] or 0 < u < 1 or 0 < t < 1)
and the unit square (denoted 0 < t, u < 1); they can be plotted and interpreted by their
shapes, as well as their numerical magnitudes, and yield functional statistics.

“Safe”  (best)  statistical  methods  provide  two  hypotheses  between  which  the  researcher 
must  choose.  In  the  Comparison  Change  Correlation  approach,  the  null  hypothesis  of  no 
relationship  is  formulated  as  implying 

data representation on [0,1] = white noise

while the alternative hypothesis implies

data representation on [0,1] = signal + white noise.

Test statistics are “linear detectors” of the form integral over [0,1] of the product of the
data  representing  function  and  the  signal  representing  function.  Quadratic  detectors  are 
sums  of  squares  of  linear  detectors.  Information  theory  detectors  are  entropy  measures  of 
comparison  density  estimators. 
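
As an illustration of the linear and quadratic detectors just described, here is a minimal numerical sketch. The cosine signal functions, the midpoint-rule integration, and the toy data-representing functions are assumptions made for this example only; the paper does not specify them at this point.

```python
import math

def linear_detector(data_fn, signal_fn, grid=2000):
    """Midpoint-rule approximation of the integral over [0, 1]
    of data_fn(u) * signal_fn(u)."""
    return sum(
        data_fn((i + 0.5) / grid) * signal_fn((i + 0.5) / grid)
        for i in range(grid)
    ) / grid

def cosine_signal(j):
    """Illustrative signal sqrt(2) * cos(j * pi * u); these functions are
    orthonormal on [0, 1] (an assumed choice, not the paper's)."""
    return lambda u: math.sqrt(2.0) * math.cos(j * math.pi * u)

def quadratic_detector(data_fn, signals, grid=2000):
    """Sum of squares of the linear detectors for the given signals."""
    return sum(linear_detector(data_fn, s, grid) ** 2 for s in signals)

signals = [cosine_signal(j) for j in (1, 2, 3)]

# Hypothetical data representation containing 0.5 times the first signal.
data_with_signal = lambda u: 0.5 * cosine_signal(1)(u)
# Noise-free stand-in for the null case: no component along any signal.
# Real white noise would add random fluctuation around this value.
flat_data = lambda u: 0.0

t1 = linear_detector(data_with_signal, signals[0])
q_signal = quadratic_detector(data_with_signal, signals)
q_flat = quadratic_detector(flat_data, signals)
```

In this toy setting the linear detector recovers the amplitude 0.5 of the embedded signal, and the quadratic detector is about 0.25 when the signal is present and 0 when it is absent.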

Factoid:  The  concept  of  null  hypothesis  was  introduced  by  R.  A.  Fisher  as  a  hypothesis 
set  up  for  the  purpose  of  being  nullified  (invalidated).  Source:  Fisher  (1990),  p.  322. 

8.  THE  P  VALUE  PROBLEMS  OF  STATISTICS 

Statistics has as its goals specification and identification of models that fit data, and
assigning “p values” to models selected by multiple comparisons. If we use modeling
methods to decide which of two treatments is better, the client wants and expects a p value for
our conclusion! Answering such distributional questions may be feasible using computer
intensive re-sampling methods which can generate the distribution of the statistics that
we propose to test relationships.
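
One common re-sampling method of this kind is a permutation test. The sketch below is an assumption-laden illustration rather than the paper's procedure: it uses the absolute difference in sample means as the statistic, hypothetical data for two treatments, and a fixed seed so the result is reproducible.

```python
import random
from statistics import fmean

def permutation_p_value(a, b, n_resamples=5000, seed=0):
    """Two-sided permutation p value for the difference in sample means.

    The pooled observations are repeatedly shuffled and split into two
    groups of the original sizes; the p value is the fraction of shuffles
    whose statistic is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(fmean(a) - fmean(b))
    pooled = list(a) + list(b)
    count = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        stat = abs(fmean(pooled[:len(a)]) - fmean(pooled[len(a):]))
        if stat >= observed - 1e-12:   # small tolerance for rounding error
            count += 1
    # Add-one correction keeps the estimate strictly positive.
    return (count + 1) / (n_resamples + 1)

treatment_a = [12.1, 11.8, 13.0, 12.6, 12.9, 13.4]   # hypothetical responses
treatment_b = [10.2, 10.9, 11.1, 10.5, 11.4, 10.8]   # hypothetical responses

p_separated = permutation_p_value(treatment_a, treatment_b)
p_identical = permutation_p_value(treatment_a, treatment_a)
```

The same loop works for any statistic that can be recomputed on shuffled data, which is what makes re-sampling attractive when the sampling distribution of a new statistic is not known analytically.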

I would like to tell you a true story that happened to me in Israel in September 1993
on a bus to the Weizmann Institute. When a statistician meets a scientist, one often gets
the reaction:

“All  scientists  need  statisticians  (good  news). 

But  we  do  not  need  them  very  much  (bad  news). 

How complicated is it to compute a p value?”

Revising  this  attitude  requires  a  public  relations  campaign  to  educate  the  scientific 
public about “Beyond Classical Statistical Methods.”

9.  HOW  TO  USE  CORRELATION  COEFFICIENTS  TO  DEVELOP 
BEYOND  STATISTICAL  METHODS 

This section is a technical outline, without examples, of Comparison Change
Correlation statistical methods, emphasizing HOW conventional statistical methods can be
expressed in terms of diverse correlation coefficients.

We start with the multi-sample problem, which we reformulate as data analysis of a
bivariate variable $(X, Y)$. Multi-sample statistical data analysis arises when we observe a variable $Y$
in $c$ cases or samples (corresponding to $c$ treatments or $c$ populations). The samples are
usually regarded as the values of $c$ variables $Y_1, \ldots, Y_c$ with respective true distribution
functions $F_1(y), \ldots, F_c(y)$ and quantile functions $Q_1(u), \ldots, Q_c(u)$. The general problem is to
model how the distribution functions $F_k$ vary with the value of the conditioning variable
$k = 1, \ldots, c$, and in particular to test the hypothesis of homogeneity of distributions:

$$H_0 : F_1 = \cdots = F_c = F.$$

The distribution $F$ to which all the others are equal under $H_0$ is considered to be the
unconditional distribution of $Y$ (which is estimated by the sample distribution of $Y$ in the
pooled sample).

For $k = 1, \ldots, c$, we observe a random sample $Y_k(j)$, $j = 1, \ldots, n_k$.
The pooled sample, of size $n = n_1 + \cdots + n_c$, represents observations of the pooled (or
unconditioned) variable $Y$. The $c$ samples are assumed to be independent of each other.

We propose that we regard the data as consisting of bivariate observations $(X, Y)$,
where $X$ represents the population $k = 1, \ldots, c$ observed and $Y$ the response observed. The
observation that is usually denoted $Y_k(j)$ is denoted in our notation $(X = k, Y = Y_k(j))$.
While $X$ is a deterministic variable rather than a random variable, the probability notation
we use can be interpreted for both cases. The marginal (unconditional) distribution of $X$
is specified by the probability mass function

$$p_X(k) = n_k / n.$$

The distribution function of $X$ is defined by

$$F_X(x) = \sum_{k \le x} p_X(k).$$

Define the indicator function $I(B)$ of an event $B$ to be 1 or 0 according as the event
$B$ did or did not occur. Thus $I(X = k)$ denotes the indicator function of the event $X = k$,
which equals 1 or 0 according as $X = k$ or $X \ne k$; $I(Y \le y)$ denotes the indicator function
of the event $Y \le y$. The distribution function of the values of $Y$ in the $k$-th sample,
previously denoted $F_k$, is now described in the notation of conditioned distributions of $Y$
given $X$:

$$F_k(y) = F_{Y|X=k}(y) = P[Y \le y \mid X = k].$$

We henceforth use empirical distributions (based on the observed data) rather than
theoretical distributions (based on the unobserved population). Then

$$\begin{aligned}
F_{Y|X=k}(y) &= E[I(Y \le y) \mid X = k] \\
&= (1/n_k) \sum_{\text{observations } (X,Y)} I(X = k)\, I(Y \le y) \\
&= E[I(X = k)\, I(Y \le y)] / p_X(k).
\end{aligned}$$

An important general formula: for a function $g(Y)$ and a set $B$ of real numbers

$$E[g(Y) \mid X \text{ is in } B] = E[g(Y)\, I(X \text{ is in } B)] / P[X \text{ is in } B].$$

An important general concept is the correlation coefficient. We now show that correlation can
be used to describe a statistic that is a conditional mean:

$$\begin{aligned}
R(X \text{ is in } B, g(Y)) &= \mathrm{CORR}[I(X \text{ is in } B), g(Y)] \\
&= E[I(X \text{ is in } B)(g(Y) - E[g(Y)])] / (\sigma[g(Y)]\, \sigma[I(X \text{ is in } B)]) \\
&= (\mathrm{odds}\, P[X \text{ is in } B])^{.5}\, E[(g(Y) - E[g(Y)]) / \sigma[g(Y)] \mid X \text{ is in } B]
\end{aligned}$$

where we define $\mathrm{odds}(p) = p/(1 - p)$. Note that $P[X \text{ is in } B] / \sigma[I(X \text{ is in } B)] =
(\mathrm{odds}\, P[X \text{ is in } B])^{.5}$.
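The identity above can be checked numerically. The following is a minimal sketch in Python (standard library only), assuming empirical moments with divisor $n$ as in the text; the data values and function names are hypothetical and chosen only for illustration.

```python
import math

def indicator_correlation(xs, gs, in_b):
    """CORR[I(X in B), g(Y)] computed from empirical (divisor n) moments."""
    n = len(xs)
    ind = [1.0 if in_b(x) else 0.0 for x in xs]
    p = sum(ind) / n
    g_mean = sum(gs) / n
    g_sd = math.sqrt(sum((g - g_mean) ** 2 for g in gs) / n)
    cov = sum(i * (g - g_mean) for i, g in zip(ind, gs)) / n
    return cov / (g_sd * math.sqrt(p * (1 - p)))

def conditional_mean_form(xs, gs, in_b):
    """(odds P[X in B])^.5 times the conditional mean of standardized g(Y)."""
    n = len(xs)
    selected = [g for x, g in zip(xs, gs) if in_b(x)]
    p = len(selected) / n
    g_mean = sum(gs) / n
    g_sd = math.sqrt(sum((g - g_mean) ** 2 for g in gs) / n)
    cond = sum((g - g_mean) / g_sd for g in selected) / len(selected)
    return math.sqrt(p / (1 - p)) * cond

# Hypothetical data: X is the sample label, Y the response, g(Y) = Y.
xs = [1, 1, 1, 2, 2, 3, 3, 3, 3]
ys = [2.0, 3.5, 1.0, 4.0, 6.5, 7.0, 5.5, 9.0, 8.0]
lhs = indicator_correlation(xs, ys, lambda x: x == 1)
rhs = conditional_mean_form(xs, ys, lambda x: x == 1)
```

Both routes give the same number up to rounding, which is the content of the displayed identity.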

The pooled sample has unconditional empirical distribution

$$F_Y(y) = (1/n) \sum_{\text{observations } (X,Y)} I(Y \le y).$$

The empirical quantile function of $Y$ is denoted $Q_Y(u)$, $0 < u < 1$, and is piecewise
constant between points $u$ satisfying $F_Y(Q_Y(u)) = u$, called exact values of $u$; exact values
of $u$ are of the form $u = F_Y(y)$ for some $y$. The quantile function of $X$ is denoted $Q_X(t)$,
$0 < t < 1$, and is piecewise constant between points $t$ satisfying $F_X(Q_X(t)) = t$, called
exact values of $t$; exact values of $t$ are of the form $t = F_X(x)$ for some $x$.

To test the null hypothesis $H_0$, three main methods are proposed in introductory
statistics courses, based on comparing (1) means, (2) scored ranks, (3) distribution functions.
We propose to unify these methods by expressing the test statistics in terms of basic
types of indicator correlations:

(1) $R(X = Q_X(t), Y) = \mathrm{CORR}[I(X = Q_X(t)), Y]$, $0 < t < 1$;

(2) $R(X = Q_X(t), \text{scored ranks of } Y) = \mathrm{CORR}[I(X = Q_X(t)), \text{scored ranks of } Y]$, $0 < t < 1$;

(3) $R(X = Q_X(t), Y \le Q_Y(u)) = \mathrm{CORR}[I(X = Q_X(t)), I(Y \le Q_Y(u))]$, $0 < t, u < 1$.

Additional statistics to be investigated for multi-sample problems are accumulation
correlations:

(4) $R(X \le Q_X(t), Y) = \mathrm{CORR}[I(X \le Q_X(t)), Y]$, $0 < t < 1$;

(5) $R(X \le Q_X(t), \text{scored ranks of } Y) = \mathrm{CORR}[I(X \le Q_X(t)), \text{scored ranks of } Y]$, $0 < t < 1$;

(6) $R(X \le Q_X(t), Y \le Q_Y(u)) = \mathrm{CORR}[I(X \le Q_X(t)), I(Y \le Q_Y(u))]$, $0 < t, u < 1$.

To motivate how the indicator correlations (1) arise in the Analysis of Variance we
introduce the following notation. The sample mean $\bar{Y}_k$ of the $k$-th sample is the conditional
mean of $Y$ given $X = k$:

$$E[Y \mid X = k] = \bar{Y}_k = (1/n_k) \sum_{j=1}^{n_k} Y_k(j) = E[Y\, I(X = k)] / p_X(k).$$

The pooled sample mean is the unconditional mean of $Y$:

$$\bar{Y} = E[Y] = \sum_{k=1}^{c} p_X(k)\, E[Y \mid X = k] = \sum_{k=1}^{c} p_X(k)\, \bar{Y}_k.$$

The unconditional variance of $Y$, and the conditional variance of $Y$ given $X = k$, are
respectively denoted

$$\mathrm{VAR}[Y] = \sigma^2(Y) = \sum_{k=1}^{c} \sum_{j=1}^{n_k} (Y_k(j) - \bar{Y})^2 / n,$$

$$\mathrm{VAR}[Y \mid X = k] = \sum_{j=1}^{n_k} (Y_k(j) - \bar{Y}_k)^2 / n_k.$$

The common variance $\sigma^2$ of $Y$ under $H_0$ is estimated by the pooled variance

$$\sigma^{*2} = E[\mathrm{VAR}[Y \mid X]] = \sum_{k=1}^{c} p_X(k)\, \mathrm{VAR}[Y \mid X = k].$$

Define the multiple correlation

$$R^2[Y \mid X] = \mathrm{VAR}[E[Y \mid X]] / \mathrm{VAR}[Y].$$

What may be novel is the observation that one can write

$$R^2[Y \mid X] = \sum_{k=1}^{c} (1 - p_X(k))\, |R(X = k, Y)|^2.$$

From the important representation

$$\mathrm{VAR}[Y] = E[\mathrm{VAR}[Y \mid X]] + \mathrm{VAR}[E[Y \mid X]],$$

infer that the pooled variance $\sigma^{*2}$ is related to the original variance
$\mathrm{VAR}[Y]$ by

$$\sigma^{*2} = \mathrm{VAR}[Y]\,(1 - R^2[Y \mid X]).$$

The $F$ statistic used in the Analysis of Variance to test $H_0$ can be shown to be
$(n - c)T^2/(c - 1)$, defining

$$T^2 = R^2[Y \mid X] / (1 - R^2[Y \mid X]) = \sum_{k=1}^{c} (1 - p_X(k))\, T^2(X = k),$$

defining

$$T(X = k) = R(X = k, Y) / (1 - R^2[Y \mid X])^{.5} = (\mathrm{odds}\, p_X(k))^{.5}\, (\bar{Y}_k - \bar{Y}) / \sigma^{*}.$$

A plot of $(n - c)^{.5}\, T(X = Q_X(t))$, $0 < t < 1$, can help determine which sub-samples are
most different from the others. Note $(n - c)^{.5}\, T(X = k)$ has a Student-$t$ distribution with
$n - c$ degrees of freedom under $H_0$ and normality, while $(n - c)T^2/(c - 1)$ has an $F(c - 1, n - c)$
distribution.
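As a check on the algebra, the one-way $F$ statistic can be assembled from the indicator correlations $R(X = k, Y)$ and compared with the textbook ratio of mean squares. This is a minimal sketch, assuming divisor-$n$ variances as in the definitions above; the three samples and the function names are hypothetical.

```python
import math

def anova_via_correlations(samples):
    """Return (R^2[Y|X], F, list of (n-c)^.5 T(X=k)) built from R(X=k, Y)."""
    n = sum(len(s) for s in samples)
    c = len(samples)
    pooled = [y for s in samples for y in s]
    ybar = sum(pooled) / n
    sd_y = math.sqrt(sum((y - ybar) ** 2 for y in pooled) / n)
    r_k, r2 = [], 0.0
    for s in samples:
        p = len(s) / n
        # R(X=k, Y) = (odds p_X(k))^.5 * (Ybar_k - Ybar) / sigma(Y)
        r = math.sqrt(p / (1 - p)) * (sum(s) / len(s) - ybar) / sd_y
        r_k.append(r)
        r2 += (1 - p) * r * r
    f_stat = (n - c) * (r2 / (1 - r2)) / (c - 1)
    t_k = [math.sqrt(n - c) * r / math.sqrt(1 - r2) for r in r_k]
    return r2, f_stat, t_k

def classical_f(samples):
    """Textbook F: between-group mean square over within-group mean square."""
    n = sum(len(s) for s in samples)
    c = len(samples)
    ybar = sum(y for s in samples for y in s) / n
    ssb = sum(len(s) * (sum(s) / len(s) - ybar) ** 2 for s in samples)
    ssw = sum((y - sum(s) / len(s)) ** 2 for s in samples for y in s)
    return (ssb / (c - 1)) / (ssw / (n - c))

# Hypothetical data: three treatments with unequal sample sizes.
samples = [[4.1, 5.0, 4.6, 5.3], [6.2, 5.8, 6.9], [5.1, 4.9, 5.6, 5.2, 5.0]]
r2, f_stat, t_k = anova_via_correlations(samples)
```

The entries of `t_k` are the per-sample quantities $(n - c)^{.5}\, T(X = k)$ that the text suggests plotting.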

The  foregoing  discussion  has  outlined  how  a  conventional  statistical  method  (one  way 
Analysis  of  Variance)  can  be  expressed  in  terms  of  correlations.  We  next  state  results  for 
expressing  other  conventional  and  beyond  methods  in  terms  of  correlations. 

$R(X = x, Y = y)$, Contingency Table Analysis

The chi-squared statistic Chi used to test independence in a contingency table of $n$
observations $(X, Y)$, where $X$ has $c$ possible values and $Y$ has $r$ possible values, can be
expressed as Chi $= n\,C(X, Y)$, in terms of a probability concept

$$C(X, Y) = \sum_{x=1}^{c} \sum_{y=1}^{r} (p_{X,Y}(x, y) - p_X(x)\, p_Y(y))^2 / (p_X(x)\, p_Y(y)),$$

expressed in terms of (empirical) probabilities. We propose to interpret this formula in
terms of indicator correlations

$$R(X = x, Y = y) = (p_{X,Y}(x, y) - p_X(x)\, p_Y(y)) / (p_X(x)\, p_Y(y)\, (1 - p_X(x))(1 - p_Y(y)))^{.5};$$

then

$$C(X, Y) = \sum_{x=1}^{c} \sum_{y=1}^{r} (1 - p_X(x))(1 - p_Y(y))\, |R(X = x, Y = y)|^2.$$
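The equivalence of the two expressions for $C(X, Y)$ is easy to confirm on a small table. The sketch below assumes counts stored as `table[x][y]` with no empty row or column; the table values and function names are hypothetical.

```python
import math

def margins(table):
    """Return n and the empirical marginal probabilities p_X, p_Y."""
    n = sum(sum(row) for row in table)
    p_x = [sum(row) / n for row in table]
    p_y = [sum(col) / n for col in zip(*table)]
    return n, p_x, p_y

def indicator_correlations(table):
    """Matrix of R(X=x, Y=y) = CORR[I(X=x), I(Y=y)]."""
    n, p_x, p_y = margins(table)
    return [[(table[i][j] / n - p_x[i] * p_y[j])
             / math.sqrt(p_x[i] * p_y[j] * (1 - p_x[i]) * (1 - p_y[j]))
             for j in range(len(p_y))]
            for i in range(len(p_x))]

def chi_square_from_correlations(table):
    """n * sum over cells of (1 - p_X)(1 - p_Y) |R(X=x, Y=y)|^2."""
    n, p_x, p_y = margins(table)
    r = indicator_correlations(table)
    return n * sum((1 - p_x[i]) * (1 - p_y[j]) * r[i][j] ** 2
                   for i in range(len(p_x)) for j in range(len(p_y)))

def chi_square_classic(table):
    """Usual sum of (observed - expected)^2 / expected."""
    n, p_x, p_y = margins(table)
    return sum((table[i][j] - n * p_x[i] * p_y[j]) ** 2 / (n * p_x[i] * p_y[j])
               for i in range(len(p_x)) for j in range(len(p_y)))

# Hypothetical 2 x 3 table of counts.
table = [[10, 20, 30], [20, 15, 5]]
chi_corr = chi_square_from_correlations(table)
chi_std = chi_square_classic(table)
```

The matrix returned by `indicator_correlations`, scaled by $n^{.5}$, shows which cells drive a significant chi-square.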
To study the independence of $X$ and $Y$ given data on $(X, Y)$ we propose a “Chi-square
and Indicator Correlation Tableau”, consisting of the $r$ by $c$ matrix $n^{.5} R(X = x, Y = y)$
and bordering rows and columns $n\,C(x, \cdot)$, $n\,C(\cdot, y)$, $n\,C_{av}$, defining

$$C(x, \cdot) = (r - 1)^{-1} \sum_{y} (1 - p_Y(y))\, |R(X = x, Y = y)|^2,$$

$$C(\cdot, y) = (c - 1)^{-1} \sum_{x} (1 - p_X(x))\, |R(X = x, Y = y)|^2,$$

$$C_{av} = (c - 1)^{-1} \sum_{x} (1 - p_X(x))\, C(x, \cdot) = (r - 1)^{-1} \sum_{y} (1 - p_Y(y))\, C(\cdot, y).$$


We assign $p$ values to these statistics as tests of the null hypothesis $H_0$ by using their
known asymptotic or exact distributions under the null hypothesis. The use in practice of
these statistics is best illustrated by examples, which require their own paper to discuss.

Rather than a table of $R(X = x, Y = y)$ we prefer a graphical presentation of

$$n^{.5} R(X = Q_X(t), Y = Q_Y(u))$$

as either a function on $0 < u < 1$ for each exact $t$ fixed, or as a function on $0 < t < 1$ for
each exact $u$ fixed. We also plot $n\,C(Q_X(t), \cdot)$ and $n\,C(\cdot, Q_Y(u))$.

The chi-squared statistic $C$ is a portmanteau or omnibus statistic. When it is significant
we want to know the cause of the rejection of independence, the nature of the
dependence, which can be obtained from the above plots which show which coefficients are
most significant.

Comparison  Analysis 

The ultimate approach to modelling is to estimate and interpret the comparison density
$d(u|t)$ and comparison distribution $D(u|t)$:

$$d(u|t) = d(u; F_Y, F_{Y|X=Q_X(t)}), \quad 0 < u < 1;$$

$$D(u|t) = \int_0^u d(u'|t)\, du' = D(u; F_Y, F_{Y|X=Q_X(t)}).$$

If $u$ and $t$ are exact values in the sense that they satisfy $u = F_Y(y)$, $t = F_X(x)$ for some $y$
and $x$, one can show that

$$D(u|t) = F_{Y|X=Q_X(t)}(Q_Y(u)).$$

The joint dependence density $d(t, u)$ is defined as a comparison density

$$d(t, u) = d(u|t) = d(t|u) = d(t; F_X, F_{X|Y=Q_Y(u)}), \quad 0 < t < 1.$$

The joint dependence distribution or copula function is defined by

$$D(t, u) = \int_0^t \int_0^u d(t', u')\, dt'\, du' = \int_0^t D(u|t')\, dt'.$$
The change PP process is defined on $0 < u < 1$ for fixed exact $t$ by

$$CPP(u|t) = \int_0^u cPP(u'|t)\, du' = (\mathrm{odds}\, p_X(Q_X(t)))^{.5}\, (D(u|t) - u).$$

The change distribution is defined on $0 < t < 1$ for fixed exact $u$ by

$$D(t|u) = \int_0^t d(t', u)\, dt' = \int_0^t d(t'|u)\, dt'.$$

$R(X = x, Y \le y)$, Multi-Sample Comparison, Accumulation Analysis
$R(X \le x, Y = y)$, Change Analysis of a Response

The chi-square statistic, based on correlations $R(X = x, Y = y)$, is most appropriate
to compute when $X$ is discrete and $Y$ is discrete. Alternative correlations for diagnosis of
the dependence of discrete $X$ and discrete $Y$, and essential correlations when one variable
is continuous and the other is discrete, are the accumulation correlation coefficients

$$R(X = x, Y \le y) = (\mathrm{odds}\, p_X(x)\; \mathrm{odds}\, F_Y(y))^{.5}\, ((F_{Y|X=x}(y) / F_Y(y)) - 1).$$

At exact $u$

$$R(X = Q_X(t), Y \le Q_Y(u)) = (\mathrm{odds}\, p_X(Q_X(t)))^{.5}\, (D(u; F_Y, F_{Y|X=Q_X(t)}) - u) / (u(1 - u))^{.5}.$$

We could plot for each exact $t$ a change PP process $CPP(u|t)$, $0 < u < 1$, which
compares conditional and unconditional distributions and is asymptotically a Brownian
Bridge under $H_0$:

$$CPP(u|t) = (\mathrm{odds}\, p_X(Q_X(t)))^{.5}\, (D(u; F_Y, F_{Y|X=Q_X(t)}) - u).$$

We always plot for each exact $t$ the change test process, which is a collection of accumulation
correlation coefficients

$$CT(u|t) = CPP(u|t) / (u(1 - u))^{.5} = R(X = Q_X(t), Y \le Q_Y(u)).$$

Recall that the set of exact $u$ values consists of $u = F_Y(y)$, $y = 1, \ldots, r - 1$. The
(Hirotsu) maximum chi-square statistic is defined for each treatment $t$ (more precisely,
treatment $x$ with $t = F_X(x)$)

$$\mathrm{R^2accummax}(t) = n \max_{\text{exact } u} |R(X = Q_X(t), Y \le Q_Y(u))|^2,$$

$$\mathrm{R^2accumave}(t) = n \sum_{\text{exact } u} |R(X = Q_X(t), Y \le Q_Y(u))|^2 / (r - 1).$$

By introducing weights $W(u)$, such as $W(u) = u(1 - u)$, one can define weighted Hirotsu
statistics:

$$\mathrm{R^2accummaxW}(t) = n \max_{\text{exact } u} W(u)\, |CT(u|t)|^2,$$

$$\mathrm{R^2accumaveW}(t) = n \sum_{\text{exact } u} W(u)\, |CT(u|t)|^2 / (r - 1).$$
For contingency tables ($X$ discrete, $Y$ discrete) these statistics provide alternatives to
the standard Chi-squared statistic to test for independence.

For multi-sample problems ($X$ discrete, $Y$ continuous) they provide goodness of fit
type statistics for testing homogeneity of populations. Modeling rather than testing is
provided by density estimation techniques which estimate $cPP(u|t)$, the change PP density
or derivative of the change PP process.
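For a contingency table with ordered columns, the accumulation correlations and the unweighted max and ave statistics can be sketched as follows. The layout `table[x][y]`, the assumption that no column is empty, and the function names are illustrative choices, not part of the original method description.

```python
import math

def change_test_process(table, x):
    """Accumulation correlations CT(u|t) for treatment row x at each exact u < 1."""
    n = sum(sum(row) for row in table)
    n_x = sum(table[x])
    odds = (n_x / n) / (1 - n_x / n)
    col_tot = [sum(col) for col in zip(*table)]
    ct = []
    cum_all = cum_x = 0
    for y in range(len(col_tot) - 1):   # the last column gives u = 1 and is skipped
        cum_all += col_tot[y]
        cum_x += table[x][y]
        u = cum_all / n                 # exact value u = F_Y(y)
        d = cum_x / n_x                 # D(u|t) = F_{Y|X=x}(Q_Y(u))
        ct.append(math.sqrt(odds) * (d - u) / math.sqrt(u * (1 - u)))
    return ct

def accum_max(table, x):
    """n times the largest squared accumulation correlation."""
    n = sum(sum(row) for row in table)
    return n * max(c * c for c in change_test_process(table, x))

def accum_ave(table, x):
    """n times the average squared accumulation correlation over the r - 1 cuts."""
    n = sum(sum(row) for row in table)
    ct = change_test_process(table, x)
    return n * sum(c * c for c in ct) / len(ct)

# Hypothetical 2 x 3 table of counts; rows are treatments, columns ordered responses.
table = [[10, 20, 30], [20, 15, 5]]
ct_row0 = change_test_process(table, 0)
```

For a 2 by 2 table there is a single cut point, and `accum_max` reduces to the ordinary chi-square statistic.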

One  of  the  accomplishments  of  our  research  is  to  relate  the  accumulation  statistics 
introduced  by  Hirotsu  (1993)  to  conventional  statistical  methods. 

$R(X = x, Y)$, Two-sample and Multi-sample Tests of Homogeneity, Analysis
of Variance (One Way)

The output of the $R(X = x, Y)$ command is a plot of the change test density
$R(X = Q_X(t), Y)$, $0 < t < 1$, and the values of the conventional $F$ test statistics of the Analysis
of Variance.

$R(X \le x, Y)$, Change Analysis of Multi-Samples

Plot $R(X \le Q_X(t), Y)$, $0 < t < 1$, and the corresponding max and ave statistics.

$R(X = x, \text{ranks})$, Non-parametric tests, Wilcoxon, Kruskal-Wallis
$R(X \le x, \text{ranks})$, Non-parametric Change analysis of multi-samples

We define ranks to be a transformation of $Y$ to $P_Y(Y)$, where $P_Y(y)$ is the mid
distribution function

$$P_Y(y) = F_Y(y) - .5\, p_Y(y).$$

$R(X = x, \text{scored ranks})$

$R(X \le x, \text{scored ranks})$

Scored ranks are a transformation of $Y$ to $J(P_Y(Y))$, where $J(u)$ is a score function, often
chosen to be a Legendre orthogonal polynomial. Their correlations can be used to guide
estimation of comparison densities.

$R(\text{scored ranks}, \text{transformation of } Y)$. Change analysis of data $Y$ transformed
by one of the transformations $I(Y = y)$, $I(Y \le y)$, $Y$, $P_Y(Y)$, $J(P_Y(Y))$,
and guide to estimation of change density (non-parametric regression).

$R(X, Y)$, $R(P_X(X), P_Y(Y))$, $R(X, P_Y(Y))$, $R(P_X(X), Y)$, Correlations and non-parametric
Spearman correlations
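The mid distribution function is straightforward to compute from the empirical counts. The following sketch uses hypothetical function names and a small data set with ties.

```python
from collections import Counter

def mid_distribution(ys):
    """Map each distinct value y to P_Y(y) = F_Y(y) - .5 * p_Y(y)."""
    n = len(ys)
    counts = Counter(ys)
    out = {}
    cum = 0
    for y in sorted(counts):
        cum += counts[y]
        out[y] = cum / n - 0.5 * counts[y] / n
    return out

def mid_ranks(ys):
    """Transform each observation Y to its rank P_Y(Y)."""
    p = mid_distribution(ys)
    return [p[y] for y in ys]

# Hypothetical data with one tie.
ranks = mid_ranks([3, 1, 2, 2])
```

Tied observations share the same rank, and the ranks average to .5 whatever the pattern of ties.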

Compute the correlations and plot the functions of which they are diagnostics. The
important formula

$$R(X, Y) = \int_0^1 ((Q_X(t) - \bar{X}) / \sigma[X])\; E[(Y - \bar{Y}) / \sigma[Y] \mid X = Q_X(t)]\, dt$$

suggests that we plot the two functions on $0 < t < 1$ that are in the integrand. Smoothing
the second function, called the change density, is the problem considered in non-parametric
regression.
Abstract: Simulation based methods of estimation have proven a useful tool for
parameter  estimation  of  complicated  stochastic  processes.  We  examine  a  simulation 
based  estimation  procedure  comparable  to  the  conditional  least  squares  estimates 
for  parameters  of  a  stochastic  process.  The  simulated  conditional  least  squares 
estimates  are  shown  to  be  consistent  and  the  asymptotic  distribution  is  derived. 

KEY  WORDS:  SIMEST,  Conditional  Least  Squares 

1.  INTRODUCTION 

Ensor, Bridges & Lawera (1993) illustrated the viability of estimation for
stochastic  processes  through  simulation.  Their  method  built  on  the  original  work  by 
Thompson,  Brown  and  Atkinson  (1987)  in  this  area.  The  premise  of  such  estimators 
is  that  a  process  can  be  simulated  directly  from  the  defining  axioms.  Parameter 
estimates  are  then  obtained  by  minimizing  over  the  parameter  space  some  measure 
of  error  between  the  simulated  process  at  a  given  point  in  the  parameter  space  and 
the  observed  series.  The  acronym  SIMEST,  for  simulation  based  estimation,  is  used 
to  describe  this  general  method. 

This method of estimation has been successfully applied in the area of marketing
by Bridges, Ensor and Thompson (1992). They model the number of “types” of
personal computers in the marketplace at time $t$. A personal computer is considered
a new type if something about the technology changed, for example the 486 chip
replacing the 386 chip. The proposed stochastic model was not solvable in closed
form. Using SIMEST they were able to use the proposed stochastic model rather
than resorting to the simplifying assumption of a deterministic model plus random
noise. Another example in marketing is presented by Bridges, Ensor and Raman
(1994). They model the number of customers for a particular home inspection firm
in the Los Angeles area as a birth and death process with constant death rate and
a birth rate which is a function of advertising expenditures.


The proximity measure minimized determines the type of estimators found via
SIMEST.  In  this  paper,  simulation  based  estimators  comparable  to  conditional  least 
squares  estimators  for  stochastic  processes  will  be  examined. 

Let $\{N(t), t > 0\}$ denote the stochastic process of interest, which is observed at
$n$ different time points, i.e. $N(t_1), \ldots, N(t_n)$. For simplicity in notation, we refer to
the observed process at time points $t_1, \ldots, t_n$ as $Y_1, \ldots, Y_n$. If one can simulate
$E[Y_i]$ or $E[Y_i \mid Y_{i-1}]$ for $i = 1, \ldots, n$, then in theory a SIMEST estimator can be obtained.
As an example of the use of SIMEST in this setting consider a general birth and
death process.

1.1.  Simulation  of  Birth  and  Death  Processes 

Consider the Markov counting process $N(t)$ with parameters $\lambda_n$ and $\mu_n$ which
satisfies the following axioms:

i) $P(N(t + \delta t) = n + 1 \mid N(t) = n) = \lambda_n \delta t + o(\delta t)$

ii) $P(N(t + \delta t) = n - 1 \mid N(t) = n) = \mu_n \delta t + o(\delta t)$

iii) The probability of more than one event in $(t, t + \delta t]$ $= o(\delta t)$.

From the above axioms it is simple to derive the distribution of the time of
the next arrival, $F_B(t)$, and the distribution of the time of the next exit from the
system, $F_D(t)$, so that

$$F_B(t) = 1 - P\{0 \text{ births in } (t, t + \delta t]\} = 1 - e^{-\lambda_n t}$$

and

$$F_D(t) = 1 - P\{0 \text{ deaths in } (t, t + \delta t]\} = 1 - e^{-\mu_n t}.$$

Using the inverse c.d.f. transformation we obtain the time until the next
birth, $t_B$, or death, $t_D$, in our process from

$$t_B = -\frac{\ln(1 - U_1)}{\lambda_n} \quad \text{or} \quad t_D = -\frac{\ln(1 - U_2)}{\mu_n} \qquad (1)$$


where $U_1$ and $U_2$ represent independent random variables from the uniform distribution
defined over the unit interval. It is then a simple matter to simulate the
conditional mean of $Y_i$ given the observed value of $Y_{i-1}$, as the following algorithm
illustrates. Let $X_{i,j}(\theta)$ denote the $j$-th simulated value of the process at time $t_i$ given
the observed value at $t_{i-1}$, $Y_{i-1}$, as the starting point of the simulation. The process
is simulated assuming parameter $\theta$. Also, let $\bar{X}_{i,m}(\theta) = (1/m) \sum_{j=1}^{m} X_{i,j}(\theta)$.
In other words, $\bar{X}_{i,m}(\theta)$ is the simulated conditional mean based on $m$ realizations
of the process at time $t_i$ given the value of the process at time $t_{i-1}$.
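The inverse c.d.f. draw of a waiting time can be sketched in a few lines. This sketch assumes the exponential form $F(t) = 1 - e^{-\text{rate} \cdot t}$ derived above; the rate values, seed, and function name are hypothetical.

```python
import math
import random

def exponential_waiting_time(rate, u):
    """Invert F(t) = 1 - exp(-rate * t) at a uniform draw u in [0, 1)."""
    return -math.log(1.0 - u) / rate

# Hypothetical birth and death rates at the current state n.
rng = random.Random(1993)
lam_n, mu_n = 2.0, 0.5
t_birth = exponential_waiting_time(lam_n, rng.random())
t_death = exponential_waiting_time(mu_n, rng.random())
next_event = "birth" if t_birth < t_death else "death"
```

Whichever waiting time is smaller determines the next event and the time at which it occurs.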
Simple Algorithm to Simulate $\bar{X}_{i,m}(\theta)$

1. Set $k = Y_{i-1}$ and $t = 0$.

2. Simulate $U_1$ and $U_2$ from the $U(0,1)$ distribution.

3. Compute $t_B$ and $t_D$.

4. Set $t = t + \min(t_B, t_D)$.

5. If $t_D < t_B$ then $k = k - 1$ else $k = k + 1$.

6. If $t < t_i - t_{i-1}$ go to 2; otherwise $X_{i,j}(\theta) = k$.

7. Repeat 1-6 $m$ times. Average $X_{i,1}(\theta), \ldots, X_{i,m}(\theta)$ to obtain $\bar{X}_{i,m}(\theta)$.

8. Move to time $i + 1$, go to 1.

An important consideration is that the computation of $t_B$ and $t_D$ depends on
the parameter values $\theta$.
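A minimal Python sketch of the simulation loop follows. It is not the authors' code: the rate functions are passed in as hypothetical callables, and the sketch checks the clock before applying an event, so an event falling after the end of the interval does not change the state, which is a small departure from the step order as printed.

```python
import math
import random

def simulate_step(k0, dt, birth_rate, death_rate, rng):
    """One simulated value X_{i,j}: the state after time dt, starting from k0."""
    k, t = k0, 0.0
    while True:
        lam, mu = birth_rate(k), death_rate(k)
        # Inverse c.d.f. draws of the exponential waiting times (infinite if rate is 0).
        t_b = -math.log(1.0 - rng.random()) / lam if lam > 0 else math.inf
        t_d = -math.log(1.0 - rng.random()) / mu if mu > 0 else math.inf
        t += min(t_b, t_d)
        if t >= dt:
            return k                     # next event falls outside the interval
        k += 1 if t_b < t_d else -1

def simulated_conditional_mean(k0, dt, birth_rate, death_rate, m, rng):
    """Average of m independent simulated values, i.e. the simulated conditional mean."""
    return sum(simulate_step(k0, dt, birth_rate, death_rate, rng) for _ in range(m)) / m

# Hypothetical rates: births proportional to remaining capacity, linear deaths.
rng = random.Random(11)
xbar = simulated_conditional_mean(
    50, 1.0, lambda k: 0.01 * (100 - k), lambda k: 0.1 * k, 200, rng)
```

Because the rate functions are evaluated inside the loop, changing the parameter values changes every waiting time, which is the dependence on the parameters noted above.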


2. SIMULATED CONDITIONAL LEAST SQUARES
ESTIMATES OF $\theta$

An often used alternative to the maximum likelihood estimator for
stochastic processes is the conditional least squares estimator discussed by Klimko
& Nelson (1978) (see also Hall & Heyde (1980)). The conditional least squares
estimator, $\hat{\theta}$, is the value of $\theta$ minimizing the conditional least squares equation

$$S_n(\theta) = \sum_{i=1}^{n} (Y_i - \mu_i(\theta))^2$$

over the parameter space $\Theta$, where $\mu_i(\theta) = E_\theta[Y_i \mid Y_1, \ldots, Y_{i-1}]$.

To obtain the simulated conditional least squares estimator, $\hat{\theta}_{n,m}$, the simulated
conditional mean replaces the conditional mean in the above equation. In other
words, the SIMEST estimator based on the conditional least squares equation is
the value $\hat{\theta}_{n,m}$ which minimizes $S_{n,m}(\theta)$ over the parameter space $\Theta$, where

$$S_{n,m}(\theta) = \sum_{i=1}^{n} (Y_i - \bar{X}_{i,m}(\theta))^2$$

and $\bar{X}_{i,m}(\theta)$ is defined in the previous section.

Since $\bar{X}_{i,m}(\theta)$ is the average of $m$ i.i.d. random variables with expectation $\mu_i(\theta)$,
as $m$, the number of simulations, approaches infinity $\bar{X}_{i,m}(\theta) \to \mu_i(\theta)$. Hence, the
simulated conditional least squares estimator maintains the same properties as the
conditional least squares estimator for large $m$.
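The simulated criterion can be sketched with any simulator plugged in. The sketch below is illustrative only: it uses a hypothetical linear death process as a stand-in simulator, a coarse grid search instead of a numerical optimizer, and a fixed seed per evaluation (common random numbers) so that the criterion varies smoothly with the parameter. None of these choices is taken from the paper.

```python
import math
import random

def simulate_linear_death(k0, dt, mu, rng):
    """Toy simulator: each of k0 individuals survives time dt with prob exp(-mu*dt)."""
    p = math.exp(-mu * dt)
    return sum(1 for _ in range(k0) if rng.random() < p)

def simulated_cls(theta, times, ys, simulate, m, seed=0):
    """Sum over i of (Y_i - simulated conditional mean given Y_{i-1})^2."""
    rng = random.Random(seed)            # same seed for every theta
    total = 0.0
    for i in range(1, len(ys)):
        dt = times[i] - times[i - 1]
        xbar = sum(simulate(ys[i - 1], dt, theta, rng) for _ in range(m)) / m
        total += (ys[i] - xbar) ** 2
    return total

# Hypothetical observed series generated with mu = 0.3.
data_rng = random.Random(42)
times = list(range(9))
ys = [200]
for _ in range(8):
    ys.append(simulate_linear_death(ys[-1], 1.0, 0.3, data_rng))

grid = [0.10 + 0.02 * j for j in range(21)]
estimate = min(grid, key=lambda mu: simulated_cls(mu, times, ys, simulate_linear_death, m=100))
```

In practice the grid search would be replaced by a derivative-free optimizer, since the criterion is only available through simulation.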
2.1 Properties of the Simulated Conditional Least Squares Estimator

Under certain regularity conditions, Klimko and Nelson (1978) show that $\hat{\theta}$
exists, is a strongly consistent estimator of $\theta$, and is asymptotically normally
distributed. Specifically,

$$n^{1/2}(\hat{\theta} - \theta) \to MVN(0, V^{-1} W V^{-1})$$

where

$$V_{k \times k} = \lim_{n \to \infty} \frac{1}{n} \sum_{i=2}^{n} g_i g_i'$$

and $g_i$ is a $k \times 1$ vector representing the derivative of the conditional mean $\mu_i(\theta)$
with respect to the parameter vector $\theta$. Also,


$$W_{k \times k} = \lim_{n \to \infty} \frac{1}{n} \sum_{i=2}^{n} \sigma_i^2(\theta)\, g_i g_i'.$$


As the number of simulations, $m$, goes to infinity, $\hat{\theta}_{n,m}$ has the same asymptotic
properties as $\hat{\theta}$. It can be shown that for large fixed $m$,

$$n^{1/2}(\hat{\theta}_{n,m} - \theta)$$

is approximately distributed as a Multivariate Normal random vector of dimension
$k$ with 0 mean and covariance matrix

$$V^{-1}\left(1 + \frac{1}{m}\right) W V^{-1}.$$

The regularity conditions of Klimko and Nelson (1978) must be met for the
above results on the simulated conditional least squares estimator to hold. It is
important to note, however, that in the SIMEST situation often the regularity conditions
can only be checked empirically through simulation since transition probabilities
are never explicitly stated. For birth and death processes with a limit on the
population size, the regularity conditions are met if the birth rate is greater than
the death rate. If the regularity conditions are not met then multiple realizations of
the process must be observed before one can estimate the parameters. If multiple
realizations are observed, SIMEST estimators can still be obtained.
2.2 Weighted Simulated Conditional Least Squares Estimates


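At each stage of our optimization we can easily compute a consistent estimator
of $\sigma_i^2(\theta)$ by computing the sample variance of $X_{i,1}(\theta), \ldots, X_{i,m}(\theta)$. Therefore, it is
a simple matter to find the weighted simulated conditional least squares estimator
by minimizing

$$S^w_{n,m}(\theta) = \sum_{i=1}^{n} (Y_i - \bar{X}_{i,m}(\theta))^2 / \hat{\sigma}_i^2(\theta)$$

where $\hat{\sigma}_i^2(\theta) = (m - 1)^{-1} \sum_{j=1}^{m} (X_{i,j}(\theta) - \bar{X}_{i,m}(\theta))^2$. For large fixed $m$, the resulting
estimator $\hat{\theta}_w$ is approximately normally distributed with mean vector $\theta$ and
covariance matrix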


m4 


In  practice  V  is  obtained  by  estimating  the  gradients  via  central  differences 
using  a  large  number  of  simulations,  then  computing 


However, using the method of Glynn (1990) in conjunction with a large number
of  simulations  leads  to  efficient  estimation  of  the  gradients,  thereby  yielding  the 
optimal  variance  estimate. 
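The central-difference step can be sketched generically. In the sketch below the function names, the step size `h`, and the stand-in mean function are hypothetical; with a simulated conditional mean, one would presumably evaluate both perturbed points with the same random seed so that simulation noise largely cancels in the difference.

```python
def central_difference_gradient(f, theta, h=1e-4):
    """Approximate the gradient of f at theta, one coordinate at a time."""
    grad = []
    for j in range(len(theta)):
        up, down = list(theta), list(theta)
        up[j] += h
        down[j] -= h
        grad.append((f(up) - f(down)) / (2 * h))
    return grad

def average_outer_product(gradients):
    """(1/n) * sum_i g_i g_i', a finite-sample analogue of the matrix V."""
    n, k = len(gradients), len(gradients[0])
    return [[sum(g[a] * g[b] for g in gradients) / n for b in range(k)]
            for a in range(k)]

# Hypothetical smooth stand-in for a conditional mean as a function of theta.
def mean_given(y_prev):
    return lambda th: y_prev * th[0] + th[1] ** 2

grads = [central_difference_gradient(mean_given(y), [0.5, 2.0]) for y in (10.0, 12.0, 9.0)]
v_hat = average_outer_product(grads)
```

Each gradient requires two evaluations per parameter, so the cost grows with both the number of parameters and the number of simulations behind each evaluation.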


3.  DOES  THIS  METHOD  WORK  IN  PRACTICE? 


A  Modest  Simulation  Study  and  an  Example 

Extensive simulation studies were conducted by Ensor, Bridges and Lawera
(1993). Their simulation studies focused on simulated least squares estimates instead
of conditional least squares estimates but clearly indicated the utility of the SIMEST
procedure. To investigate the usefulness of the simulated conditional least squares
estimation procedure, this method of estimation was repeated numerous times and
summary statistics of the replicated estimates obtained. The model used consisted
of a linear death rate $\mu_n = \mu n$ (with $\mu = .1$) and birth rate $\lambda_n = (1000 - n)n\lambda$ (with
$\lambda = .1$). The Nelder-Mead (1965) optimization routine was used. One thousand
replicates of the simulated weighted conditional least squares estimation described
in Section 2.2 with $m = 500$ resulted in a mean of .0978 with standard deviation
.006261 for the parameter $\lambda = .1$ and a mean of .1047 with a standard deviation of
.03604 for the parameter $\mu = .1$. The average of $S^w_{n,m}(\theta)$ for the 1,000 replications
was 12.81 with a standard deviation of 4.00. Comparable results were obtained for
the simulated conditional least squares estimate based on $m = 500$ and $m = 2500$.
For this particular model, the conditional variance at each time point is relatively
constant, hence the weighted conditional least squares estimate does not provide
significant improvement. Often this will not be the case.

In addition to repeated replications of the various estimators, we examined the
asymptotic properties for one realization. Estimating the gradient for the covariance
matrix via central differences based on 10,000 simulated values, we obtained a
standard error of .0103 for the parameter estimate of $\lambda$, which for this realization
was .09068, and a standard error of .0517 for the parameter estimate of $\mu$, which
was .0893 for this realization. The correlation between the two estimates was -.14.
Again, we note that better estimates of the covariance matrix can be obtained using
the method of Glynn (1990).

As mentioned in the introduction, Bridges, Ensor and Raman (1994) model
the number of customers for a particular home inspection firm in the Los Angeles
area as a birth and death process. The data consists of annual observations
of the number of customers and information on both direct and indirect advertising
costs for the first 13 years of the company's existence. Direct advertising
consists of such costs as yellow pages, brochures, etc. Indirect advertising primarily
consists of the cost of networking with the real estate agents in the area. The
marketing model proposed was a birth and death model with constant death rate
$\mu_n = \mu n$ and birth rate which depended on both types of advertising, namely
$\lambda_n = (N - n)(\lambda_1 \sqrt{a_d} + \lambda_2 \sqrt{a_i})$, where $a_d$ and $a_i$ represent the direct and
indirect, respectively, advertising expenditures at the current time. The advertising
expenditures are linearly interpolated between years to yield a continuous function
of time. The maximum number of potential customers $N$ is assumed to be 50,000.
Using the simulated weighted conditional least squares estimator described in Section
2.2 we obtain estimates of .0000 (standard error = .0000) for $\lambda_1$, .1934 (standard
error = .003375) for $\lambda_2$, and 11.83 (standard error = .2205) for $\mu$. For this example,
the correlation between the estimate of the indirect advertising coefficient and the
exit coefficient is very high, namely .986. Again the standard errors and correlation
are found from the asymptotic covariance matrix. As hypothesized by the marketing
researchers, direct advertising (coefficient $\lambda_1$) does not affect the number of
customers the company obtains.
4. SUMMARY


We have presented an alternative method of estimating the parameters of a
stochastic process when a closed form representation of the conditional expected
value of the process is not available. This method of estimation is comparable to
conditional least squares estimators of the parameters. Klimko and Nelson (1978)
compare the performance of conditional least squares estimators and maximum
likelihood estimators in similar scenarios.

The simulated conditional least squares estimator is preferred over the previously
proposed simulated least squares estimator (Ensor, Bridges and Lawera
(1993)). To obtain the simulated least squares estimator one must simulate multiple
realizations from an initial starting point. This method of simulation can lead
to high variability in the sample mean path, thereby leading to instability in the
least squares criterion function which is minimized. However, the simulated conditional
mean is very stable for a moderate number of simulations, resulting in a
criterion function with very little noise at a given point in the parameter space.
The gain in stability is due to the fact that the estimate of the conditional mean
at a particular time is independent of the estimate of the conditional mean at any
other time, whereas the estimate of the mean at a particular time is dependent on
the previous history of the process. This independence also facilitates the proofs of
the asymptotic properties of the simulated conditional least squares estimators.
A New Series of Supersaturated Designs

Supersaturated  designs  are  factorial  designs  in  which  the  number  of 
factors  exceeds  the  number  of  observations.  Such  designs  preclude  the 
possibility  of  complete  orthogonality,  making  near  orthogonality  the 
obvious  goal.  In  the  present  paper,  designs  are  constructed  using  a 
new  and  general  method.  These  designs  surpass  previous  designs  in 
all  cases  but  one.  Design  matrices  are  presented  in  the  appendix. 

1  Introduction 

There are many settings in which it is desirable to examine the effects of a
large number of factors simultaneously. Plackett and Burman (1946) devised
optimal designs for studying $f = n - 1$ factors with $n$ observations. These
designs are completely pairwise orthogonal. A natural extension of their
work involves studying a number of factors, $f$, greater than the number of
observations, $n$. Such designs may be useful when it is necessary to examine
the influence of many factors, and observations are expensive to collect or
are otherwise limited. When $f < n$ complete orthogonality is achievable, but
when $f > n$ the goal is to obtain a design matrix where the columns are as
nearly orthogonal as possible.

One  early  approach  to  this  problem  was  the  method  of  group  screening 
proposed  by  Watson  (1961).  The  method  involves  combining  f  factors  into 
g  groups.  Each  group  is  then  tested  as  a  single  factor  in  a  standard  design. 
If  the  effect  of  a  grouped  factor  is  significant  its  component  factors  are  then 
tested  individually. 

Another approach to this problem is the method of supersaturated designs.
A supersaturated design is a single design matrix for which f is greater
than n. The first approach to supersaturated designs was that of
Satterthwaite (1959), who suggested randomly selecting the design vectors.
Later, Booth and Cox (1962) devised optimality criteria and a method for
generating supersaturated designs. One of their criteria, near orthogonality,
involves minimizing the maximum absolute value of the dot product over all
pairs of vectors. Of the designs that achieve this criterion, one then
selects the design that minimizes the number of pairs of vectors with this
dot product. Booth and Cox show that a dot product of four is a lower bound
on this maximum for all designs with f > n. Note that near orthogonality is
a minimax procedure and produces designs such that no pair of vectors is
highly correlated. Booth and Cox proposed a second criterion, denoted E[s²],
which is the mean of the squared pairwise dot products. Minimizing E[s²]
results in designs with a few highly correlated vectors, but most are
pairwise orthogonal. For designs in which no pairwise dot products are
larger than 4, near orthogonality and E[s²] are equivalent criteria in that
they yield the same design.
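The two criteria described above can be sketched in a few lines of Python. This is a minimal illustration, not code from Booth and Cox or from the present paper; the function names and the toy design are assumptions made for the example. Design vectors are taken to be tuples of +1/-1 entries.

```python
from itertools import combinations

def pairwise_dot_products(columns):
    """Dot products s_ij for every pair of design vectors (+1/-1 columns)."""
    return [sum(a * b for a, b in zip(u, v)) for u, v in combinations(columns, 2)]

def near_orthogonality(columns):
    """Return (max |s_ij|, number of pairs attaining it); smaller is better."""
    abs_dots = [abs(s) for s in pairwise_dot_products(columns)]
    worst = max(abs_dots)
    return worst, abs_dots.count(worst)

def e_s2(columns):
    """Mean of the squared pairwise dot products."""
    dots = pairwise_dot_products(columns)
    return sum(s * s for s in dots) / len(dots)

# Hypothetical toy design: n = 4 observations, f = 4 factors.
design = [
    (+1, +1, -1, -1),
    (+1, -1, +1, -1),
    (+1, -1, -1, +1),
    (+1, +1, -1, -1),  # repeats the first column, so one pair has dot product 4
]

print(near_orthogonality(design))
print(e_s2(design))
```

Because Python compares tuples lexicographically, comparing the pairs returned by `near_orthogonality` reproduces the two-stage rule: first the maximum absolute dot product, then the number of pairs that attain it.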

Booth and Cox’s method begins with a Plackett and Burman (1946) orthogonal
design, to which they add f - n + 1 randomly generated trial vectors,
resulting in an n x f initial design matrix. Next, they determine the pair of
vectors with the greatest dot product, and attempt to replace each of the two
vectors with a new randomly generated vector. A vector is replaced if the
resulting design matrix is superior in terms of the near orthogonality
criterion. Booth and Cox continued this process until 30 minutes of clock
time on the University of London Mercury computer passed without an
improvement to the design matrix. Booth and Cox used n = 12, 18, 24 and f as
large as 2n.

Rosenberger and Smith (1984) focused on very small designs (f = 4, 5, ..., 9
and n < f). For these designs, they were able to select the best design
according to the near orthogonality criterion by means of an exhaustive
search. For larger designs this approach is computationally untenable. For
example, the number of possible designs when f = 24 and n = 12 is on the
order of 10^47.
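The text does not say how this count was obtained. One reading that reproduces the stated order of magnitude, offered here only as an assumption, is that each design vector is balanced (six +1 and six -1 entries when n = 12) and that a design is an unordered set of f distinct vectors:

```python
import math

n, f = 12, 24

# Assumption: candidate design vectors are balanced, i.e. n/2 entries are +1.
balanced_vectors = math.comb(n, n // 2)

# Assumption: a design is an unordered choice of f distinct candidate vectors.
designs = math.comb(balanced_vectors, f)

order_of_magnitude = math.floor(math.log10(designs))
print(balanced_vectors, order_of_magnitude)
```

Under these assumptions there are 924 candidate vectors, and the number of designs has 48 decimal digits, which agrees with the order of magnitude quoted in the text.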

Lin’s designs (1993) involve selecting a half fraction of a Plackett and
Burman design matrix of size 2n. The resulting matrices have n observations
and f = 2n - 2 factors. He examined all such half fraction designs resulting
from a given Plackett and Burman design and reported the best design
according to the near orthogonality criterion. To obtain designs with
f < 2n - 2, Lin selected a subset of the columns from his f = 2n - 2 design.
Lin used a variety of n’s between 8 and 30.

In the present paper we seek a general method that improves upon existing
supersaturated designs with n = 4k, where k is a positive integer. This is
the class of designs for which pairwise orthogonality is possible and for
which corresponding Plackett and Burman designs exist.

2  Method 

We begin our method by creating a matrix of f randomly generated design
vectors. The next stage involves a series of passes designed to improve the
initial matrix. Each pass examines each of the f vectors in sequential order.
When a vector is examined it is compared to a set of alternative vectors with
respect to the resulting near orthogonality criterion for the entire matrix.
If a superior alternative vector exists, the original vector is replaced by
the best of the alternatives. The series of passes continues until a pass
occurs that fails to improve any of the f vectors. For n = 8, 12, 16, and 20
the set of alternative vectors is composed of all possible design vectors.
For n ≥ 24, this is not computationally practical, so the set of alternative
vectors consists of a randomly sampled subset of all possible design vectors.
Note that this method is sufficiently general to be used with supersaturated
designs of any dimension and with any optimality criterion based on pairwise
dot products, such as E[s²].
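The pass structure described above can be sketched as follows. This is an illustrative reading of the method, not the authors' program: the function names are hypothetical, and restricting candidates to balanced vectors (equal numbers of +1 and -1) is an assumption, since the text says only "all possible design vectors".

```python
import random
from itertools import combinations

def balanced_vectors(n):
    """All +1/-1 vectors of length n with n/2 entries equal to +1 (assumption)."""
    vectors = []
    for plus_positions in combinations(range(n), n // 2):
        chosen = set(plus_positions)
        vectors.append(tuple(1 if i in chosen else -1 for i in range(n)))
    return vectors

def criterion(design):
    """Near orthogonality: (max |dot product|, pairs attaining it); smaller is better."""
    dots = [abs(sum(a * b for a, b in zip(u, v)))
            for u, v in combinations(design, 2)]
    worst = max(dots)
    return worst, dots.count(worst)

def search(n, f, seed=0):
    rng = random.Random(seed)
    alternatives = balanced_vectors(n)   # exhaustive set; sample it for large n
    design = [rng.choice(alternatives) for _ in range(f)]  # random start
    improved = True
    while improved:                      # each loop iteration is one pass
        improved = False
        for j in range(f):               # examine the vectors in sequential order
            best_vec, best_val = design[j], criterion(design)
            for cand in alternatives:
                val = criterion(design[:j] + [cand] + design[j + 1:])
                if val < best_val:       # keep the best strictly better alternative
                    best_vec, best_val = cand, val
            if best_vec != design[j]:
                design[j] = best_vec
                improved = True
    return design                        # no single replacement improves it

result = search(8, 10, seed=1)
print(criterion(result))
```

Every replacement strictly lowers the criterion, so the loop must terminate, and it stops exactly when a full pass changes nothing, which mirrors the stopping rule in the text. Swapping `criterion` for another function of the pairwise dot products changes the optimality criterion without touching the search.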

3  Results 

We generated designs for n = 8, 12, 16, 20, and 24, and corresponding sets
of f with n < f ≤ 2n. A summary of the designs obtained appears in Tables 1-5.
The actual design matrices are given in the appendix. Note that “EEM”
refers to Ehm, Elliott, and McGee and denotes our method. The column
heading “0” refers to the number of pairwise dot products equal to 0, the
column heading “4” refers to the number of pairwise dot products equal in
absolute value to 4, and so forth.
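A row of such a summary table can be produced by tallying the pairwise dot products of a design by absolute value. The sketch below uses a hypothetical four-observation example, not one of the designs in the appendix, and the function name is an assumption.

```python
from collections import Counter
from itertools import combinations

def tally_abs_dot_products(columns):
    """Count pairs of design vectors by the absolute value of their dot product."""
    return Counter(abs(sum(a * b for a, b in zip(u, v)))
                   for u, v in combinations(columns, 2))

# Hypothetical n = 4 example; the third column is the negative of the first,
# so that pair has dot product -4 and is counted under the heading "4".
columns = [
    (1, 1, -1, -1),
    (1, -1, 1, -1),
    (-1, -1, 1, 1),
    (1, -1, -1, 1),
]
tally = tally_abs_dot_products(columns)
print(sorted(tally.items()))
```

The counts in a row always sum to f(f - 1)/2, the number of pairs of factors, which gives a quick consistency check on any row of the tables.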


n = 8 EEM

  f      0      4
 12      ?     20
 14     63     28
 16     76     44

(? = entry illegible)

Table  1:  Designs  with  n  =  8 


Table 1 refers to designs with n = 8. In this case, Booth and Cox do not
present a design and Lin’s method cannot be used.

Table 2: Designs with n = 12

Table 2 refers to designs with n = 12. Here, all three methods have been
applied. In Figure 1, we plot the proportion of pairwise orthogonal vectors in
order to compare the designs graphically. Clearly, both our method and Lin’s
are superior to Booth and Cox’s method. Our method surpasses Lin’s in all
cases except when f = 2n - 2, the case for which his method is designed.

Figure 1: Proportions of Pairwise Orthogonal Vectors for the EEM, Lin, and
Booth and Cox Methods

Tables 3 and 4 refer to designs where n = 16 and n = 20, respectively. In
these cases, Booth and Cox do not present a design and Lin’s method cannot
be used.

Table 5 refers to designs with n = 24. Here, all three methods have been
applied. With f = 30, our design clearly surpasses Booth and Cox’s design
according to the near orthogonality criterion. Also, with f = 46 our design is
superior to Lin’s in terms of near orthogonality. This case, where f = 2n - 2,
is the case for which his method was specifically designed. For designs in
which some pairwise dot products are larger than 4, E[s²] and near
orthogonality are no longer equivalent criteria. As discussed earlier, these
criteria produce designs with different characteristics. In light of this,
one could apply our method using E[s²] as the criterion if one preferred
designs having a higher proportion of pairwise orthogonal vectors, but also
having some pairwise dot products equal in absolute value to 8.

As seen in Figure 2, the attainable proportion of pairwise orthogonal vectors
decreases steadily as the number of factors in the design increases.
We suspect this reflects the inherent geometry of the problem. Note that in
Figure 1, the proportion of pairwise orthogonal vectors in Lin’s designs does
not decrease substantially as f increases. This suggests that Lin’s method of
selecting subsets of designs where f = 2n - 2 to obtain designs with smaller
f is inadequate. Another observable trend is that, for a given number of
factors, one obtains a slightly better design with larger n. The magnitude of
the effect of f is larger than that of n.


Proportion  of  Pairwise  Orthogonal  Vectors  (EEM) 


Figure  2:  Decrease  in  the  proportion  of  attainable  pairwise  orthogonal  vectors 
as  the  number  of  factors  increases 


4  Discussion  and  Conclusions 

We will consider our method in comparison to competing methods. First,
we compare our approach to that of Booth and Cox. Both methods are
large scale search procedures, but ours has a natural stopping criterion
whereas theirs is arbitrary. Because both are general algorithms they can be
used for a wide variety of n and f. For comparable designs, our method
produces uniformly superior results.

We suspect the following observations may explain why we obtain better
results than Booth and Cox. We found that starting with an orthogonal
matrix made it difficult to add vectors with small dot products, while starting
with randomly generated design vectors produced superior results. We also
achieved better results when sequentially considering each of the f vectors
for replacement rather than replacing the vectors at random.

We now compare our method to that of Lin. Although computationally
simple, Lin’s method is highly specialized and can only be applied to a limited
number of values of n. For given n it only produces designs for f ≤ 2n - 2.
Furthermore, Lin’s method for deriving designs for f < 2n - 2 gives poor
results. This is consistent with our own findings. The ideal set of design
vectors changes so dramatically from one level of f to another that even
the best subsets of larger matrices do not yield good designs. This further
suggests that methods specialized for particular values of n and f are unlikely
to produce good designs for other combinations of n and f. Our method
generates design matrices that exceed Lin’s in all cases but one.

Our design for n = 16 and f = 32 is the largest supersaturated design
published for which all dot products are less than or equal in absolute value
to the theoretical minimum of 4.

5  Future  Work 

This  research  was  inspired  by  a  problem  posed  by  the  late  Dr.  Carl  Bates. 
He  needed  to  estimate  the  effects  of  104  factors  using  52  observations.  The 
104  factors  were  parameters  in  a  model  and  the  observations  were  the  52 
Sun  workstations  to  which  he  had  access.  In  order  to  solve  this  problem 
we  hope  to  thoroughly  investigate  designs  where  n  >  20  with  respect  to  the 
near  orthogonality  criterion.  We  also  hope  to  produce  designs  in  which  the 
maximum pairwise dot product is 4 for n > 20. Finally, we will investigate
the class of designs generated by our method when the optimality criterion
is E[s²].

Table 3: Designs with n = 16


           n = 24 EEM           n = 24 Lin          n = 24 B & C
  f      0      4      8      0      4      8      0      4      8
 30    178    257      0      -      -      -    295     81     59
 32    183    311      2      -      -      -      -      -      -
 34    202    351      8      -      -      -      -      -      -
 36    202    416     12      -      -      -      -      -      -
 38    223    460     20      -      -      -      -      -      -
 40    243    507     30      -      -      -      -      -      -
 42    261      ?     40      -      -      -      -      -      -
 44    273    623     50      -      -      -      -      -      -
 46    309    662     64    414    552     69      -      -      -
 48    337    711     80      -      -      -      -      -      -

(? = entry illegible)

Table 5: Designs with n = 24

6 Appendix

The  design  matrices  generated  by  our  method  are  given  below.  In  each 
matrix,  the  rows  correspond  to  the  observations  and  the  columns  to  the 
factors. 

n = 8   f = 12   [design matrix illegible]

n = 8   f = 14   [design matrix illegible]

n = 8   f = 16   [design matrix illegible]

n = 12  f = 16   [design matrix illegible]

n = 12  f = 18   [design matrix illegible]

n = 12  f = 20   [design matrix illegible]

n = 12  f = 22   [design matrix illegible]

n = 16  f = 30   [design matrix illegible]


n = 16  f = 32   [design matrix illegible]


n = 20  f = 30   [design matrix illegible]


n = 20  f = 34   [design matrix illegible]


n = 20  f = 36   [design matrix illegible]


n = 20  f = 38   [design matrix illegible]


n = 20  f = 40   [design matrix illegible]


n = 24  f = 30   [design matrix illegible]


[Table of + and - signs for n = 24, f = 32; the entries are not recoverable from the scan.]


[Table of + and - signs for n = 24, f = 34; the entries are not recoverable from the scan.]


[Table of + and - signs for n = 24, f = 36; the entries are not recoverable from the scan.]


[Table of + and - signs for n = 24, f = 38; the entries are not recoverable from the scan.]


[Table of + and - signs for n = 24, f = 40; the entries are not recoverable from the scan.]


[Table of + and - signs for n = 24, f = 42; the entries are not recoverable from the scan.]


[Table of + and - signs for n = 24, f = 44; the entries are not recoverable from the scan.]


[Table of + and - signs for n = 24, f = 46; the entries are not recoverable from the scan.]


[Table of + and - signs for n = 24, f = 48; the entries are not recoverable from the scan.]


AN EMPIRICAL STUDY OF THE DISTRIBUTION AND PROPERTIES OF
THE SLOPE ESTIMATOR USING MINIMUM NORMED DISTANCE CRITERION


Barbara A. Wainwright


Statement of the Problem

How should one estimate a linear relation between two
variables? It is common to use a regression model and
automatically apply the ordinary least squares method of estimating
parameters. This is sometimes the wrong model and method, and thus
one should consider other types because of the variables and the
assumptions in question. This leads in turn to considering various
techniques of estimation. The topic of this paper is the
estimation of linear structural relations when there is measurement
error in both the dependent and independent variables. These
problems are often referred to as Model II regression problems
[Graybill, 1961] or measurement error models [Fuller, 1987]. There
are several techniques for estimating the model parameters.
However, the technique that will be investigated is the one that
minimizes the perpendicular distance between the observed points
and the estimated line. While these estimates have been derived,
there is very little known about the exact distribution of the
slope estimator and its properties other than consistency,
some asymptotic properties [Fuller, 1987], and some approximate
tests and confidence limits [Creasy, 1956; Kendall and Stuart,
1973]. This paper will investigate the following properties of the
slope estimator:

1.  the  shape  of  the  density  of  this  estimator  for  small  samples, 

2.  the  expected  value, 

3.  the  bias,  and 

4. the probability of Type I errors for both small and large
   samples.


The Minimum Norm Distance Method of Estimation for the Classical
Errors in Variables Case

There are many techniques for estimating the structural
relation parameters, but if one assumes normality and uses maximum
likelihood estimation techniques, then unidentifiability is an
issue. There are ways to alleviate this problem when either $\sigma_e^2$,
the measurement error variance associated with the y values, $\sigma_d^2$,
the measurement error variance associated with the x values, or the
ratio of the two error variances ($\lambda$) is known [Kendall and Stuart,
1973; Lindley, 1953]. We will examine the one in which we know the
ratio of the variances of the measurement errors, lambda. This
case is referred to by Fuller [1987] as the classical errors in
variables case. The resulting maximum likelihood estimator
minimizes the weighted sum of the squared statistical
distances between the observed points and the estimated line. For
the case in which $\lambda = 1$, it would minimize the perpendicular
distances. This is often referred to as the "minimum norm
distance." The problem of minimizing the norm distance was
discussed as early as 1877 by Adcock, 1879 by Kummel, and 1901 by
Karl Pearson. However, this approach is regularly attributed
(especially in clinical chemistry) to W. Edwards Deming, who
reintroduced it in 1943 [Cornbleet and Gochman, 1979; Goldschmidt
et al., 1981; Lloyd, 1978; Mandel, 1964; Northern, 1981; Schall et
al., 1980; Smith et al., 1980; Vormbrock and Helger; Wakkers
et al., 1975; Weisbrot, 1985; Westgard and Hunt, 1973; Zucker,
1947]. According to Mandel [1964], Deming minimized the weighted
sum of squares


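$$S = \sum_{i=1}^{n} \left[ (x_i - \hat{x}_i)^2 + \lambda (y_i - \hat{y}_i)^2 \right]$$


such that


$$\hat{y}_i = \hat{\alpha} + \hat{\beta} \hat{x}_i .$$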


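The resulting slope estimator may be expressed as

$$\hat{\beta} = \frac{\lambda S_{yy} - S_{xx} + \sqrt{(S_{xx} - \lambda S_{yy})^2 + 4 \lambda S_{xy}^2}}{2 \lambda S_{xy}}$$

where

$$S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2$$

$$S_{yy} = \sum_{i=1}^{n} (y_i - \bar{y})^2$$

$$S_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$$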

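Other resulting estimators are

$$\hat{\alpha} = \bar{y} - \hat{\beta} \bar{x}$$

and

$$\hat{\sigma}_e^2 = \frac{1}{n-2} \left[ \sum_{i=1}^{n} (y_i - \bar{y})^2 - \hat{\beta} \sum_{i=1}^{n} (y_i - \bar{y})(x_i - \bar{x}) \right].$$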
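The estimators above can be sketched in a few lines of Python. This is a minimal illustration, not the author's program: the function name `deming_fit` and the argument `lam` are hypothetical, and `lam` is assumed to be the ratio of the x error variance to the y error variance, so that `lam = 1` gives the perpendicular (minimum norm distance) fit.

```python
import math


def deming_fit(x, y, lam=1.0):
    """Slope and intercept minimizing sum((x - xhat)^2 + lam * (y - yhat)^2).

    Assumption: lam = (error variance of x) / (error variance of y).
    lam = 1 corresponds to minimizing perpendicular distances.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    xbar = sum(x) / n
    ybar = sum(y) / n
    # Corrected sums of squares and cross products.
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    if sxy == 0:
        raise ValueError("slope is undefined when S_xy is zero")
    root = math.sqrt((sxx - lam * syy) ** 2 + 4.0 * lam * sxy ** 2)
    slope = (lam * syy - sxx + root) / (2.0 * lam * sxy)
    intercept = ybar - slope * xbar
    return slope, intercept
```

For points lying exactly on a line, the fit returns that line for any positive `lam`. With `lam = 1`, exchanging the roles of x and y gives the reciprocal slope, which is the symmetry that ordinary least squares lacks.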


If lambda is not known, then assuming it is one is possibly
better than ignoring it altogether. However, Vormbrock and
Helger suggest the use of duplicates for estimating lambda in
method comparison studies analyzed with Deming's procedure. For
this approach the following sums of squares are calculated


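[The displayed formulas are only partly legible in the scan. They
define sums of squares and cross products over the duplicate
measurements $x_{i1}, x_{i2}$ and $y_{i1}, y_{i2}$, including a corrected
sum of squares of the form
$\sum (x_{i1}^2 + x_{i2}^2) - [\sum (x_{i1} + x_{i2})]^2 / (2n)$
and the sums of squared duplicate differences
$\sum (x_{i1} - x_{i2})^2$ and $\sum (y_{i1} - y_{i2})^2$.]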

from  which  the  following  are  computed 


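[Equation (2) and the accompanying estimators are not legible in
the scan; the surviving fragments show a denominator of the form
$(1 + \lambda \hat{\beta}^2)(2n - 2)$.]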


According to Feldman et al. [1981], no one knows the exact
sampling distribution of $\hat{\beta}$. It is important to test $H_0: \beta = 1$,
particularly in method comparison studies. If this is taken as a
constraint on principal components and standard principal
components, it is equivalent to saying $\sigma_y^2 = \sigma_x^2$ when X and Y are
dependent. Morgan [1939] transforms these measurements and then
opts for a t-test. Using a similar t-test with a slightly
different transformation, one can test $\beta = \beta_0$ for any value of $\beta_0$.
Confidence intervals can also be constructed. However, Morgan's
test does not apply if the above constraint cannot be imposed.
Kendall and Stuart [1973] as well as Creasy [1956] give
confidence limits and tests of hypotheses for the case $\lambda = 1$
using the fact that $\beta = \tan \theta$, and the fact that a function of the
sample correlation coefficient, $r \sqrt{n-2} / \sqrt{1 - r^2}$, has a
"Student's t" distribution with (n-2) degrees of freedom when y is
normally distributed and the correlation coefficient, $\rho$, is zero.
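The t statistic built from the sample correlation coefficient can be computed as in the short sketch below. The function name `t_statistic_from_r` is illustrative only; the rotation of axes used by Creasy and by Kendall and Stuart to turn a hypothesis about the slope into a hypothesis of zero correlation is not reproduced here.

```python
import math


def t_statistic_from_r(r, n):
    """Return r * sqrt(n - 2) / sqrt(1 - r^2).

    Under normality and zero population correlation this statistic
    follows Student's t distribution with n - 2 degrees of freedom.
    """
    if n <= 2:
        raise ValueError("n must exceed 2")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)
```

The resulting value would be compared with a t table on n - 2 degrees of freedom.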
From the literature, it does not appear that the expected value
of $\hat{\beta}$ is known, especially since the sampling distribution is not
known. Fuller [1987], in his exposition of the asymptotic
properties, claims that $\hat{\beta} - \beta = O(n^{-1})$. He also gives the variances
of the limiting distribution. These are


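$$\hat{V}(\hat{\beta}) = \frac{\hat{\sigma}^2 s_{vv} + \hat{\sigma}_d^2 s_{vv} - \hat{\beta}^2 \hat{\sigma}_d^4}{(n-1) \hat{\sigma}^4} \qquad (3)$$


where


$$s_{vv} = \frac{(n-1) \left( \frac{1}{\lambda} + \hat{\beta}^2 \right) \hat{\sigma}_d^2}{n-2}$$


[The displayed definitions of the remaining quantities are not
legible in the scan.]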


Sampling Distribution of $\hat{\beta}$

Although the exact sampling distribution of $\hat{\beta}$ is complex and
there does not seem to be a general, closed form solution, by
examining various expressions of $\hat{\beta}$, and by using various
transformations of variables and Mellin transforms, we can obtain
expressions for the density of $\hat{\beta}$ for some special cases. For small
samples under some of these situations, we can show that the
density is far from being a t distribution.

Special case: n = 2

To investigate the density of the slope estimator, let us
consider the simplest case in which n = 2. When n = 2, the line is
uniquely determined regardless of the intended method of
estimation. The density of the estimator will, however, depend upon
the conditions imposed and the assumptions made. First, we will
assume that the relation between X and Y is given by


$$Y = \beta X. \qquad (4)$$


That is, we will assume that the intercept, $\alpha$, is zero.


Case 1: $X_i$ Measured Without Error

For the first case we will assume that the relation above
holds and that the $X_i$ are fixed or predetermined values of the
random variable $x_i$. These values are measured without error. This
is therefore an example of a regression situation. We observe


yi  -  *i  + 


where 


e  ~  N(0,  o#a) 


and  thus 


yrm  p^,0.a) 


The  slope  of  the  line  in  this  case  is  given  by 


:  -  yu-2. i  .  xi 

P  x2  -  X  * 
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In  this  case 


y*  -  y2  -  yx  -  tf(P*a  -  P*i  >  2 O 


and 


Jf'  -  Xa  -  ^ 


ia  constant.  Thus 


20.a 


(X,  ~  Xx) 


Not*  that  in  th*  uaual  cas*  of  l*a*t  squares, 


Vaz 


x)a 


which  i*  th*  *am*. 
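The agreement with the least squares variance can be checked directly for n = 2. The short derivation below uses only the standard least squares variance $\sigma_e^2 / \sum (X_i - \bar{X})^2$ and the definitions of this case:

```latex
\bar{X} = \frac{X_1 + X_2}{2}
\quad\Longrightarrow\quad
\sum_{i=1}^{2} (X_i - \bar{X})^2
  = 2\left(\frac{X_2 - X_1}{2}\right)^2
  = \frac{(X_2 - X_1)^2}{2},
\qquad\text{so}\qquad
\frac{\sigma_e^2}{\sum_{i=1}^{2} (X_i - \bar{X})^2}
  = \frac{2\sigma_e^2}{(X_2 - X_1)^2}.
```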


Case 2: $X_i$ a Normally Distributed Random Variable

Let us now assume that the relation defined is a structural
one between unobservable random variables. That is, $X_i$ is an
unobservable random variable. Assume the following:

$$X_i \sim N(\mu, \sigma^2)$$

where we observe

$$x_i = X_i + d_i, \qquad d_i \sim N(0, \sigma_d^2)$$

$$y_i = \beta X_i + e_i$$

where

$$e_i \sim N(0, \sigma_e^2).$$

Thus the following distributional theory exists.

$$x_i \sim N(\mu,\ \sigma^2 + \sigma_d^2)$$

$$y_i \sim N(\beta\mu,\ \beta^2\sigma^2 + \sigma_e^2).$$

For the case in which n = 2,

$$\hat{\beta} = \frac{y_2 - y_1}{x_2 - x_1} = \frac{y^*}{x^*}$$

where

$$y^* \sim N(0,\ 2(\beta^2\sigma^2 + \sigma_e^2))$$

$$x^* \sim N(0,\ 2(\sigma^2 + \sigma_d^2)).$$

In this case $\hat{\beta}$ is a ratio of normal random variables, each having
mean zero. One could attempt to obtain the density of $\hat{\beta}$ by changing
variables and by using the moment generating function. However,
Maple and Derive could not evaluate these integrals. Through Mellin
transforms, Springer [1979] derives the density for the ratio of
two dependent standard normals (p. 156) and Craig [1942] derives
the density for the ratio of two dependent normal random variables
each with mean zero and any finite variance. Taking the bivariate
normal density of $(x^*, y^*)$, making a change of variables, and using
the Mellin transform for two dependent random variables [Craig,
1942; Springer, 1979], it is determined that the density of $\hat{\beta}$ for
this case is given by


80 


It is also worth noting that if x and y obey a bivariate normal
probability density function, the mean value of y/x does not exist.
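For reference, the ratio of two zero-mean, jointly normal variables has a well-known Cauchy-type density. The statement below is the standard textbook result written in this section's notation; the shorthand $\sigma_{x^*}$, $\sigma_{y^*}$ and $\rho$ is introduced here, and the covariance assumes that $X_i$, $d_i$ and $e_i$ are mutually independent. It is a sketch under those assumptions, not a transcription of the expression derived in the paper:

```latex
f(w) = \frac{1}{\pi}\,
       \frac{\sigma_{x^*}\,\sigma_{y^*}\sqrt{1-\rho^2}}
            {\sigma_{x^*}^2 w^2 - 2\rho\,\sigma_{x^*}\sigma_{y^*}\, w + \sigma_{y^*}^2},
\qquad -\infty < w < \infty,

\sigma_{x^*}^2 = 2(\sigma^2 + \sigma_d^2), \quad
\sigma_{y^*}^2 = 2(\beta^2\sigma^2 + \sigma_e^2), \quad
\rho = \frac{2\beta\sigma^2}{\sigma_{x^*}\sigma_{y^*}}.
```

This is a Cauchy density centered at $\rho\,\sigma_{y^*}/\sigma_{x^*} = \beta\sigma^2/(\sigma^2 + \sigma_d^2)$, so it has no mean, which agrees with the remark above.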

Case 3: $X_i$ an Unobservable Mathematical Variable

If $X_i$ is a mathematical variable, then the relation $Y = \beta X$ is
a functional relation. If we observe

$$x_i = X_i + d_i, \qquad d_i \sim N(0, \sigma_d^2)$$

then

$$y_i = \beta X_i + e_i$$

where

$$e_i \sim N(0, \sigma_e^2)$$

and

$$y_i \sim N(\beta X_i,\ \sigma_e^2).$$

The slope estimator is computed in the usual manner, but in this
case

$$y^* = y_2 - y_1 \sim N(\beta X_2 - \beta X_1,\ 2\sigma_e^2)$$

$$x^* = x_2 - x_1 \sim N(X_2 - X_1,\ 2\sigma_d^2).$$

Among  the  various  substitutions  or  change  of  variables  that 
Craig  [1942]  uses  is  that 


z 


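For the case of a ratio of two dependent normal random variables
with nonzero means, Craig [1942] provides an expression for the
density of w as follows:

$$f(w) = \frac{\exp\left[-\dfrac{1}{2(1-\rho^2)}\left(r_1^2 - 2\rho r_1 r_2 + r_2^2\right)\right]}{2\pi\sqrt{1-\rho^2}} \int_{-\infty}^{\infty} \exp\left[-\frac{a u^2}{2} + b u\right] |u|\, du$$

where

$$a = \frac{1 - 2\rho w + w^2}{1 - \rho^2} > 0, \qquad \rho^2 < 1,$$

$$b = \frac{r_1 - \rho r_2 + (r_2 - \rho r_1)\, w}{1 - \rho^2}.$$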


According  to  Craig, "this  can  be  calculated  from  existing  tables  for 
particular  values  of  w  and  of  the  parameters."  No  closed  form 
solution  seems  to  exist  particularly  for  z. 

All of this theory above still applies for the case in which
$Y = \alpha + \beta X$. The means in each case will be identical since $\alpha$ will
merely subtract out. Therefore the distributions derived for each
of the three cases will be the same regardless of the value of $\alpha$.

General case: Arbitrary n

Various transformations of variables were made in an attempt
to derive the density of $\hat{\beta}$. Without loss of generality we can
assume that the means are zero and thus we can use the sums of
squares and crossproducts rather than the sums of squared
deviations from the mean. It is still the case that when $\lambda = 1$,

$$\hat{\beta} = w + \sqrt{w^2 + 1} \qquad (5)$$

where

$$w = \frac{a - b}{2c}$$

but now

$$a = \sum y_i^2, \qquad b = \sum x_i^2, \qquad c = \sum x_i y_i.$$

We hoped to obtain the density of w through an appropriate
transformation or through Mellin transforms. Once we had the
density of w, then another change of variable based on Equation 5
could possibly yield the density of $\hat{\beta}$. It did not.
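The slope formula can be sketched in code. The block below is a minimal illustration, not the authors' S-Plus code: it assumes the estimator is the usual Deming (errors-in-variables) slope with the ratio convention lam = var(d)/var(e), which is the convention implied by the numbers in the Application section, and it computes the sums of squares about the sample means. The function names `deming_slope` and `slope_eq5` are hypothetical.

```python
import math


def deming_slope(x, y, lam=1.0):
    """Errors-in-variables slope; lam = var(d) / var(e) (assumed convention).

    Undefined when the sample covariance is exactly zero.
    """
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    diff = lam * syy - sxx
    return (diff + math.sqrt(diff ** 2 + 4.0 * lam * sxy ** 2)) / (2.0 * lam * sxy)


def slope_eq5(x, y):
    """Equation 5 form for lam = 1: w + sqrt(w^2 + 1), valid when c > 0."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    a = sum((yi - my) ** 2 for yi in y)   # sum of squares of y
    b = sum((xi - mx) ** 2 for xi in x)   # sum of squares of x
    c = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))  # crossproducts
    w = (a - b) / (2.0 * c)
    return w + math.sqrt(w * w + 1.0)


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    y = [1.1, 1.9, 3.2, 3.9, 5.1]
    print(deming_slope(x, y, 1.0), slope_eq5(x, y))
```

With lam = 1 and a positive sum of crossproducts the two functions agree; for a negative sum of crossproducts the square root in Equation 5 would need the opposite sign.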
Expected Value of $\hat{\beta}$ and Analysis of the Bias

Through Taylor series expansions an approximate expression is
obtained for the asymptotic expected value of $\hat{\beta}$. For large n,
under the assumption that $\lambda = 1$, we may conclude that


+  l 


Recall that $\lambda = 1$ implies $\sigma_d^2 = \sigma_e^2$ and the above expression reduces
to $\beta$, which is expected of a consistent estimator. However, it can
be shown that use of an incorrect value of $\lambda$ can introduce an
additional bias that does not approach zero in the limit as n goes
to infinity. Note that when $\lambda \neq 1$, a similar expression results
and the same situation exists.
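The non-vanishing bias can be made concrete with a short limit calculation. The sketch below assumes the estimator has the usual Deming form with $\lambda = \sigma_d^2/\sigma_e^2$ (the convention implied by the Application section's numbers) and writes $\lambda_0$, a symbol introduced here, for the value actually used in the computation:

```latex
\hat{\beta}(\lambda_0) \;\xrightarrow{\;p\;}\;
\frac{\lambda_0\sigma_{yy} - \sigma_{xx}
      + \sqrt{(\lambda_0\sigma_{yy} - \sigma_{xx})^2 + 4\lambda_0\sigma_{xy}^2}}
     {2\lambda_0\sigma_{xy}},
\qquad
\sigma_{xx} = \sigma^2 + \sigma_d^2,\;
\sigma_{yy} = \beta^2\sigma^2 + \sigma_e^2,\;
\sigma_{xy} = \beta\sigma^2 .

% Correct ratio: \lambda_0 = \sigma_d^2/\sigma_e^2 gives
% \lambda_0\sigma_{yy} - \sigma_{xx} = (\lambda_0\beta^2 - 1)\sigma^2
% and the limit reduces to \beta.

% Incorrect ratio, example: \beta = 1,\ \sigma^2 = 1,\ \sigma_d^2 = 1,\ \sigma_e^2 = .25
% (true ratio 4) with \lambda_0 = 1:
\frac{-0.75 + \sqrt{0.75^2 + 4}}{2} \approx 0.693 .
```

The limiting value of about .69 in this example is close to the simulated mean of .7085 reported for the same configuration at n = 24.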


Simulation Results
Large Samples

Computer simulations are performed using S-Plus. Cases are
considered for which X is a fixed vector with measurement error and
for X a random variable. For each of these cases various values of
the parameters and various sample sizes are considered. These
simulations seem to support the theories. For large samples the
distribution of $\hat{\beta}$ appears normal in most cases. However, the use of
an incorrect value of $\lambda$ does introduce bias, as Figures 1 through 4
indicate. Note that underestimating $\lambda$ results in underestimation
of $\beta$ on the average and vice versa.

Table 1 provides the mean, variance, and upper and lower tail
probability of rejection of $H_0$: $\beta = 1$ for various cases. It should
be noted that for n = 100 and $\beta = 1$ most simulations result in a
density of $\hat{\beta}$ that is very close to a normal density. However, in
method comparison and bioequivalence studies, we often assume that
$\lambda = 1$ if the error variances are unknown. Doing so can shift the
density to the left or right. The shift or biasing effect can
greatly increase the chance of making a Type I error in testing $H_0$:
$\beta = 1$. It should be noted that many other simulations were performed
with similar results.
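The biasing effect of a wrong ratio can be reproduced with a small Monte Carlo experiment. The sketch below is illustrative only and is not the original S-Plus code: it assumes a Deming-type slope with lam = var(d)/var(e), uses one hypothetical configuration (X ~ N(3,1), var(d) = 1, var(e) = .25, so the true ratio is 4), and compares the average slope when the correct ratio and when lam = 1 is used. All function names, the seed, and the number of replications are assumptions.

```python
import math
import random


def deming_slope(x, y, lam):
    """Errors-in-variables slope; lam = var(d) / var(e) (assumed convention)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    diff = lam * syy - sxx
    return (diff + math.sqrt(diff ** 2 + 4.0 * lam * sxy ** 2)) / (2.0 * lam * sxy)


def simulate_mean_slope(lam_used, n=100, reps=300, beta=1.0, mu=3.0,
                        var_x=1.0, var_d=1.0, var_e=0.25, seed=1993):
    """Average slope estimate over reps samples of size n."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(reps):
        true_x = [rng.gauss(mu, math.sqrt(var_x)) for _ in range(n)]
        # Both observed variables carry independent measurement error.
        x = [t + rng.gauss(0.0, math.sqrt(var_d)) for t in true_x]
        y = [beta * t + rng.gauss(0.0, math.sqrt(var_e)) for t in true_x]
        slopes.append(deming_slope(x, y, lam_used))
    return sum(slopes) / reps


mean_correct = simulate_mean_slope(lam_used=4.0)  # true ratio var_d / var_e
mean_wrong = simulate_mean_slope(lam_used=1.0)    # ratio underestimated

if __name__ == "__main__":
    print("correct ratio:", mean_correct)
    print("lam = 1 used: ", mean_wrong)
```

Under these assumptions the average with the correct ratio stays near the true slope of one, while the average with lam = 1 settles well below it, which is the underestimation pattern described above.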


Smaller Sample Simulations

In smaller samples asymptotics do not always hold and, in fact,
extreme values of the statistic often result. Since samples of
sizes 24, 36, and 48 are commonly used for bioequivalence studies
[Snikeris, 1992], these are considered along with still other
sample sizes, only some of which will be addressed here.

Very skewed densities often result, while in other cases the
densities are fairly symmetric. Table 2 shows the results of
Figure 1: Density of Beta-hat. n = 100, $\beta = 1$, X ~ N(5,4), d ~ N(...

Figure 4: Density of Beta-hat. n = 100, $\beta = 1$, X = (1:10,10), d ...


Table 1: Simulation Results: n = 100, $H_0$: $\beta = 1$, $H_a$: $\beta \neq 1$


.0625  .25 


.0625  .25 


.25  .25 


.25  .25 


.25 

.25 

LZ3 

1:10,10 


1:10,10 


N(  5.1 ) 


N(5,4) 


N(5,4) 


N  5.4) 


N  ( 5.4) 


N(5,4) 


N(5,16) 


1:10.10 


1:10.10 


1:10.10 


1:10.10 


1:10.10 


N  3.1) 


N(  3.1 ) 


N(3,l) 


1:10.10 


1:10.10 


mean      var       lower prob   upper prob   total     shape
1.0007    .0010     .0190        .0260        .0450     symmetric
1.0157    .0010     .0057        .0750        .0807     symmetric
1.0130    .0356     .0080        .0397        .0047     slight skew
1.0010    .0034     .0160        .0320        .0480     symmetric
.9105     .0029     .4003        .0003        .4006     symmetric
.9999     ?         .0230        .0280        .0510     symmetric
.9764     .0008     .1383        .0037        .1420     symmetric
.9995     .0013     .0230        .0270        .0500     ?
.9996     .0003     .0250        .0230        .0480     symmetric
1.0008    .0016     .0230        .0270        .0500     symmetric
1.0450    .0017     .0013        .0647        .0660     symmetric
1.2400    .0067     .0000        .8600        .8600     symmetric
.9982     .0054     .0230        .0270        .0500     symmetric
1.4585    .0071     ?            ?            1.0000    symmetric
5.2657    2.3979    .0000        .7273        .7273     skewed
4.2460    1.6350    .0000        .7237        .7237     skewed
.2505     .0034     1.0000       .0000        1.0000    symmetric
1.0080    .0128     .0150        .0350        .0500     symmetric
.9910     .0009     .0520        .0110        ?         symmetric
1.0040    .0009     .0210        .0320        .0530     symmetric

several simulations when n = 24. Although one hopes that the error
variances are not larger than the variance of X here, a few
simulations indicate the general trend of how these variances
greatly affect the density of $\hat{\beta}$. From the table we can see that
whenever $\sigma^2$ is smaller than one or both of the error variances (in
the case $\beta = 1$ it does not matter) very extreme estimates result
and thus the variance of $\hat{\beta}$ is very large (larger than it should be
according to Fuller). In fact the sampling distribution of $\hat{\beta}$ is
very skewed to the left or right, as Figures 5 and 6 show.

For smaller error variances the sample correlation tends to be
larger, and therefore we do not observe as many large values for
the estimator. Figure 7 displays a more stable density. This
difference is also evident in the probabilities of Table 2.
Simulation results for n = 36 and n = 48 appear in Tables 3 and 4
respectively. Comparing these tables, it is plain to see that for
any given situation, the mean is closer to the true value as the
sample size increases from 24 to 36 to 48. Notice that the
variances become smaller, as we would expect. In many cases
densities skewed for n = 24 become less skewed for n = 36, and
still less for n = 48.
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H 

H 

X 

| 

ESI 

mean 

4 

var 

4 

lower 

prob 

upper 

prob 

total 

shape 

.25 

.25 

N  (3, .25) 

1 

1.2360 

11.6920 

skew  left 

.25 

.25 

1 

1.0390 

.0833 

.0013 

.0367 

.0380 

iraa 

.25 

.25 

N  (3,1) 

1 

1.0156 

.0303 

.0053 

.0340 

.0393 

symmetric  | 

.25 

.25 

N  (3.1) 

1 

1.0570 

.0206 

.0050 

.0430 

.0480 

symmetric 

.25 

.25 

KlfcEJM 

1 

1.0270 

.0123 

”0100 

.0360 

.0460 

BEsmsai 

1 

1 

N  (3.1) 

1 

•33.0430 

.0000 

BSllCTMlI 

i  1 

.25 

N  (3.1) 

.25 

1.0142 

.0774 

.0117 

.0303 

.0420 

symmetric  j 

i 

.25 

N  (3,1) 

1 

1.5532 

.64G6 

.0343 

.0350 

e-mail 

i 

.25 

N  (3,4) 

.25 

1.0045 

.0160 

.0150 

.0240 

.0300 

symmetric  | 

l 

.25 

N  (3,4)  |  1 

1.1080 

.0108 

.0000 

.0060 

.0960 

skew  left  | 

.25 

1 

N  (3,1)  1 

.7085 

.0428 

.2367 

.0040 

.2407  |  skew  left  | 

.25 

1 

1.0552 

1.1002 

.0003 

.0030 

.0033  1  skew  lett  | 

.25 

1 

N  (3.4)  |  1  .0180  1  .0145 

.0637 

.0003 

.0730  I  symmetric  I 

■a 

l 

jKltAlUUIAiUIlHftMl 

.0030 

.0387 

.0417  I  jymmetnc  | 

LA 

.5 

N  (3,1)  |  2 

.0003 

.0370 

.0373 

skew  rt.  | 

B 

.5 

N  (3.1)  |  1 

.0020 

.0367 

.0163 

.0530  1  symmetric  1 

.25 

.3 

N  (3,4)  1  2 

1.0061 

.0009 

.0113 

.0300 

.0413 

symmetric  1 

IEH 

N  (3.4)  I  1 

.0734 

.0008 

.0203 

.0150 

.0443 

symmetric  | 

.5 

.25 

N  (3, .23)  1  1 

1.(1314 

2428.0910 

TTiMl 

.0030 

.0031 

skow  left 

.5 

.25 

1.1049 

7.6676 

.0027 

.0060 

.0087 

sym  peak 

.5 

.25 

N  (3,1) 

1  1 

1.1657 

.0676 

.0003 

.0803 

.0806 

symmetric  1 

\m 

.25 

N  (3.1) 

!  .3 

1.0162  .0468 

.0083 

.0310 

.0303 

symmetric  | 

ira 

.25 

mmm 

mm 

1.0016  )  .0004 

,0147 

■rarei 

.0420 

symmetric  | 

IKl 

.25 

mmm 

mm 

mammmm 

.0037 

.0517 

.0554 

symmetric  1 

La. 

1 

N  (3.1) 

i 

ITTttElB  —TO1 

.0090 

.0117 

“ToSo if 

sym  peak 

iim 

1 

^  (3,1) 

2  1.1207  2.9510 

iKMl 

.0053 

.0060 

skew  rt.  | 

Figure 5: Density of Beta-hat. n = 24, $\beta = 1$, X ~ N(3,1), d ~ N(0,1), e ~ N(0,.25), mean = 1.0552, variance = 1.1092, $\lambda$ = ...

Figure 7: Density of Beta-hat. n = 24, $\beta = 1$, X ~ N(3,1), d ~ N(0,.25), e ~ N(0,.5), mean = 1.0162, variance ...


Table 3: Simulation Results: n = 36, $H_0$: $\beta = 1$, $H_a$: $\beta \neq 1$

sigma_e^2  sigma_d^2  X          lambda  mean     var         lower prob  upper prob  total   shape
.25        .25        N(3,.25)   1       1.0690   .2047       .0010       .0290       .0300   sym peak
.25        .25        N(3,1)     1       1.0110   .0175       .0100       .0370       .0470   ?
1          1          N(3,1)     1       1.0690   .2047       .0010       .0290       .0300   ?
1          1          N(3,.25)   1       .4436    3318.9860   .0030       .0020       .0050   peak
1          .25        N(3,1)     .25     1.0090   .0483       .0160       .0310       .0470   symmetric
1          .25        N(3,1)     1       1.4900   .1295       ?           .2000       .2000   slight skew
1          .25        N(3,4)     1       1.1046   .0124       ?           .1313       .1316   symmetric
1          .25        N(3,4)     .25     1.0040   .0106       .0190       .0200       .0480   symmetric
.25        ?          N(3,1)     1       .7063    .0237       .4690       .0010       ?       ?
.25        1          N(3,1)     4       1.0470   .0063       ?           .0460       .0460   skewed
.25        ?          N(3,4)     1       .0149    .0082       .1293       .0053       .1346   ?
.25        ?          N(3,4)     4       1.0084   .0106       .0080       .0383       .0463   ?
.5         .25        N(3,.25)   1       1.8740   21.0020     .0060       .0070       .0130   ?
.5         .25        N(3,.25)   ?       1.0970   1.4640      .0013       ?           .0080   ?
.5         ?          ?          ?       1.0027   .0060       .0147       .0200       .0437   ?
.5         .25        N(3,4)     1       1.0341   .0066       .0037       .0560       .0597   symmetric
?          .25        ?          .5      1.0120   .0271       .0103       .0363       .0466   ?
?          .25        N(3,1)     1       1.1550   ?           .0007       .1000       .1007   symmetric
.25        ?          N(3,4)     2       1.0018   .0065       .0110       .033        .0443   symmetric
.25        ?          N(3,4)     1       .9729    .0060       .0370       .01         .0513   ?
1          .25        ?          .25     1.1001   1.9390      .0030       .0083       .0113   sym peak
1          .25        ?          1       2.6050   3979.8870   .0030       .0020       .0050   peak skew left
Table 4: Simulation Results: n = 48, $H_0$: $\beta$


lower 

prob 


.0020 


.0140 


.0047 


.0020 


.0140 


.0020 


.0017 


.0143 


.0000 


.0180 


.0013 


.0210 


.6503 


.0010 


.1700 


.0057 


.0007 


N  (3, .25) 


N  (3,1) 


N  (3, .25) 


N  (3,1) 


N  (3, .25) 


N  (3, .25) 


N  (3,1) 


N  (3,1) 


N  (3,4) 


1:8.6 


1:8.6 


(3.1) 


N  (3.1) 


N  (3.4) 


N  (3.4) 


N  (3. .25) 


N  (3. .25) 


N  (3.4) 


N  (3.4) 


N  (3.4) 


1.0406 


1.0060 


.7050 


1.0406 


1.0056 


2.0860 


I  1.0470 


I  1.0070 


1.4800 


I  1.0010 


I  1.1028 


1.0720 


1.0016 


I  .6970 


I  1.0270 


I  .9132 


I  1.0883 


I  1.7510 


I  1.0344 


I  .9093 


I  1.0348 


I  1.0033 


.9697 


.1014 


.0125 


205.8400 


.1014 


.0125 


1735.5600 


.8793 


.0340 


.0875 


.0072 


.0090 


.0056 


.0051 


.0171 


.0397 


.0062 


.0073 


134.8598 


upper 

prob 

total 

shape 

.0330 

.0350 

skew  left 

.0300 

.0440 

symmetric 

.0027 

.0074 

skew  left 

.0330 

.0350 

skew  right 

.0300 

.0440 

EEGBBE1 

.0020 

.0040 

skew  left 

.0110 

.0217 

skew  left 

.0257 

.0400 

symmetric 

IHU 

skew  right 

.0210 

.0390 

.1670  |  .1670 


.1530  |  .0543 


.0003 


.0450 


.0030 


.0340 


.0047 


.0042 


.0048 


.0044 


.0040 


.0153 


.0033 


.0100 


,0553 


.0263 


.0313 


.0093 


.6506 


.0460 


.1730 


.0397 


.0054 


.0400 


.0416 


.0690 


.0413 


.0646 


symmetric 


symmetric 


symmetric 


skew  left 


EBU 


symmetric 


skew  left 


skew  right 


mmai 


symmetric 


symmetric 


skew  right 


We can see from the simulation results that large values of $\hat{\beta}$
often result even when the true value of the parameter is one.
Recalling the expression for $\hat{\beta}$, we can see that a low covariance
(or correlation coefficient in the case of normality) will result
in a large slope estimate. This low sample correlation often
results from introducing large error variances. Another objective
becomes developing a rule based on checking the sample correlation,
and determining what to do with these extremes. An empirical rule
was developed to detect and test for extremes. Once an estimate is
detected as extreme, the correlation coefficient is tested for
significance. If the correlation coefficient is not significantly
different from zero, then the low correlation is the cause of the
extreme estimate and therefore it is an unreliable estimate of $\beta$.
Now the problem becomes what to consider extreme. The objective is
to screen those large estimates that are due to low correlation.
Therefore we do not want to detect as extreme an estimate that is
large because $\beta$ is large. Since we would probably not know the
actual variances in a single sample problem, a conservative
estimate of variance is needed to screen an estimate for
extremeness. It is generally the case that the measurement error
variances will be smaller than the variance of X. When testing $H_0$:
$\beta = 1$, the maximum variance is attained when $\sigma^2 = \sigma_d^2 = \sigma_e^2$. When this is
the case a good approximation of the variance is 3/n. After much
screening using various estimates, it seems that what works best
for detecting most of the extremes that resulted from low
correlation is $\pm 5\sqrt{3/n}$. Thus any estimate that is more than five
standard deviations from the hypothesized value of $\beta$ will be tested
for significant correlation.

In general, if it is believed that

$$\sigma_e^2 = \sigma_d^2 = k\sigma^2$$

then the estimate of variance is given by

$$\frac{2k + k^2}{n}.$$

If $\beta \neq 1$, then the estimate of variance is

$$\frac{(\beta^2 + 1)k + k^2}{n}.$$
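The screening rule can be sketched in a few lines. The block below is a hedged illustration, not the authors' implementation: the variance formula is the one given above for equal error variances k times the variance of X, the five-standard-deviation width follows the text, and the correlation statistic is the usual t statistic for testing zero correlation, which is an assumption since the text does not name the test. All function names are hypothetical.

```python
import math


def screening_variance(n, k=1.0, beta0=1.0):
    """Approximate variance ((beta0^2 + 1) k + k^2) / n; equals 3/n for k = beta0 = 1."""
    return ((beta0 ** 2 + 1.0) * k + k ** 2) / n


def is_extreme(beta_hat, n, k=1.0, beta0=1.0, width=5.0):
    """Flag an estimate more than `width` screening standard deviations from beta0."""
    return abs(beta_hat - beta0) > width * math.sqrt(screening_variance(n, k, beta0))


def correlation_t_statistic(r, n):
    """Usual t statistic (n - 2 degrees of freedom) for testing zero correlation."""
    return r * math.sqrt((n - 2) / (1.0 - r ** 2))


if __name__ == "__main__":
    n = 8
    for beta_hat in (1.5, 4.5):
        print(beta_hat, is_extreme(beta_hat, n))
```

An estimate flagged by `is_extreme` would then be kept or set aside depending on whether the sample correlation differs significantly from zero; the critical value for that comparison is left to the analyst.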


Table 5 provides various simulations when n = 8 and the
screening rule is used. One can see that in practically all cases
100% of those detected as extreme were due to low correlation.

Table 5: Simulation Results: n = 8, Truncation Results Using

sigma_e^2  sigma_d^2  X        lambda used  Fraction deleted due to low correlation
.25        .25        N(3,1)   1            ?
1          1          N(3,1)   1            1.0000
1          .25        N(3,1)   .25          1.0000
1          .25        N(3,1)   1            .9864
1          .25        N(3,4)   .25          1.0000
1          .25        N(3,4)   1            1.0000
.25        1          N(3,1)   1            1.0000
.25        1          N(3,1)   4            1.0000
.25        1          N(3,4)   1            1.0000
.25        1          N(3,4)   4            1.0000
.25        .5         N(3,1)   1            1.0000
.25        .5         N(3,1)   2            1.0000
.25        .5         N(3,4)   2            1.0000
.5         .25        N(3,1)   1            1.0000
.5         .25        N(3,1)   .5           1.0000
.5         .25        N(3,4)   1            1.0000
.5         .25        N(3,2)   1            .0474
.5         .25        N(3,2)   .5           1.0000

Application

We have the opportunity to analyze some real data using this
estimation technique. The data come from two systems, called A and
B, being compared for equivalence in handling specimens. Each
system analyzed one hundred specimens, not once but twice
consecutively. Therefore we have replicates for estimating lambda.
Figure 8 gives a plot of system A measurements versus system B
measurements. It shows a strong linear trend. We are interested,
however, in whether the slope is significantly different from one,
suggesting that the systems are not equivalent.

Analysis of Original Data

Normal plots for x and y indicate that the densities of x and
y do not differ drastically from a normal one, and since n = 200, we
can assume asymptotics hold. We will analyze the two hundred pairs
assuming that the ratio of error variances is one. The analysis
results in

$$\hat{\beta} = 1.140371$$

$$\hat{V}[\hat{\beta}] = .00627$$

so

$$Z = 1.7724.$$
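The quoted statistic is consistent with a simple z-ratio of the estimate against the hypothesized slope of one. A minimal sketch, assuming Z is computed as the estimate minus one, divided by the square root of the estimated variance (the function name is illustrative):

```python
import math


def z_statistic(beta_hat, var_beta_hat, beta0=1.0):
    """z-ratio for testing H0: beta = beta0 given an estimated variance."""
    return (beta_hat - beta0) / math.sqrt(var_beta_hat)


# Values reported for the original data with the ratio of error variances set to one.
z_original = z_statistic(1.140371, 0.00627)

if __name__ == "__main__":
    print(round(z_original, 2))
```

Because the reported variance is rounded, the recomputed value agrees with the printed one only to about two decimal places.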


This does not suggest that we should reject the hypothesis $\beta = 1$.
As we saw from the simulations, we will have large probabilities of
Type I errors if we assume that $\lambda = 1$ when in fact it is not. Here
we do not have to worry about a Type I error, but if
underestimation occurs because we let $\lambda = 1$, we could be making a
Type II error. Therefore, lambda will be estimated using Vormbrock
and Helger's method of duplicates and we will use this estimate in
computing $\hat{\beta}$. Using the duplicates to estimate the error variances,
we find that

$$\hat{\sigma}_d^2 = 1.840312$$

$$\hat{\sigma}_e^2 = .463623$$

so

$$\hat{\lambda} = 3.9694.$$
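The ratio follows directly from the two error variance estimates. The sketch below also shows one common way to estimate a measurement error variance from duplicate readings (half the mean squared difference between duplicates); whether this matches the cited method of duplicates in every detail is an assumption, and the function name and the small data set in the usage example are hypothetical.

```python
def duplicate_error_variance(first, second):
    """Error variance of a single reading from paired duplicates.

    Uses sum((a - b)^2) / (2 n), a standard duplicates estimator (assumed here).
    """
    n = len(first)
    return sum((a - b) ** 2 for a, b in zip(first, second)) / (2.0 * n)


# Ratio of the two reported error variance estimates.
var_d = 1.840312
var_e = 0.463623
lam_hat = var_d / var_e

if __name__ == "__main__":
    print(round(lam_hat, 4))
    # Hypothetical duplicate readings, for illustration only.
    print(duplicate_error_variance([10.0, 12.0, 15.0], [10.5, 11.0, 15.0]))
```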


Using this estimated ratio in the computation of $\hat{\beta}$ yields

$$\hat{\beta} = 1.149295$$


99 


and

$$z = 1.88.$$

These results do not differ much from the previous ones, but when
n = 200 we would expect the density of $\hat{\beta}$ to be approximately normal
and rather stable, particularly when the variance of X is large
relative to the measurement error variances. The sample variances
of x and y are 106.5541 and 138.0568, respectively. Recall that $\sigma_x^2$
$= \sigma^2 + \sigma_d^2$ and so $\sigma^2 = 106.5541 - 1.840312 = 104.71379$. This is very
large relative to the measurement error variance. As we saw from
the theory, the expected value of $\hat{\beta}$ will shift when $\lambda = 1$ is
incorrectly assumed. We have only one estimate, yet we still can
see that this second estimate is slightly larger and this suggests
that when $\sigma_e^2 < \sigma_d^2$ and $\lambda = 1$ is used, there is underestimation on
average. This may suggest in turn that the true value of $\beta$ is
closer to the second estimate. However, we do not have a
probabilistic statement of this fact since we are using only an
estimate of $\lambda$ and we are comparing only one estimate of $\beta$ obtained
by each approach. Although we have not studied the effects of $\hat{\lambda}$ on
the density of $\hat{\beta}$ in this paper, it seems better to estimate lambda
than to assume it is one. Therefore, when we select only one large
sample and estimate $\beta$, we should use an estimate of $\lambda$ rather than
an assumption that $\lambda = 1$.

If we carry out a least squares analysis on the entire set of
observations, the results are as follows:

$$\hat{\beta} = 1.12217$$

$$s_b = .013555$$

and

$$Z = 9.01.$$

The standard error of this estimate is smaller than the standard
error of the estimate obtained from the norm distance technique.
The smaller variance (.00018) for least squares results from
ignoring measurement error variability. Mandel takes the
relationship between Deming's estimator and the least squares
estimator to be
$$\hat{\beta} = {}^{LS}b_{y \cdot x}\left(1 + \frac{s_d^2}{s_x^2 - s_d^2}\right).$$

This allows us to compare the variance of Deming's estimator with
the least squares estimator by

$$\mathrm{Var}[\hat{\beta}] = \mathrm{Var}\left[{}^{LS}b_{y \cdot x}\left(1 + \frac{s_d^2}{s_x^2 - s_d^2}\right)\right].$$

Analysis of Transformed Data

In scanning these data, it seems that the variances tend to
increase as the measurement values increase. This can be seen
slightly from the scatterplot. While this may not be large enough
to worry about, we can deal with it by splitting the ranked data
into two equal groups and testing for homogeneity of variance.
While Bartlett's Test suggests a significant difference between the
two variances, this test is often considered too sensitive.
Cochran's test also suggests heterogeneity of variances. In light
of this, we can try to achieve homogeneity of variance with a data
transformation. According to Bartlett [1947], if $\sigma^2 = k^2 m$ where m
is the mean, then $\sqrt{x}$ is a possible transformation. If this works to
correct the variance problem, the variance of the transformed data
will be $.25k^2$. For x, System A, the variance is 2.38 times the
mean. The variance of the transformed data is 1.45 and $.25(2.38)^2$
$= 1.42$. Along with the scatterplot in Figure 9, this suggests that
we now have homogeneity of variance for the transformed data.
Analyzing the transformed data under the assumption that $\lambda = 1$ we
have the following results:


$$\hat{\beta} = 1.1011$$

$$\hat{V}[\hat{\beta}] = .0054$$

so that

$$z = 1.38.$$

Figure 9: Scatterplot of Specimen Measurements from System A vs System B, n = 200 (transformed data)


Using the previous estimate of $\lambda = 4$, the results are:

$$\hat{\beta} = 1.1093$$

$$\hat{V}[\hat{\beta}] = .0054$$

and hence

$$z = 1.49.$$

Next we can estimate lambda again for the transformed data.
This time $\hat{\lambda} = 3.2096$. If we analyze the transformed data using $\lambda =$
3, the results are:

$$\hat{\beta} = 1.1080$$

$$\hat{V}[\hat{\beta}] = .0054$$

and

$$z = 1.47.$$


In  none  of  these  analyses  do  we  find  a  significant  difference 
between  the  two  systems. 

If we carry out another least squares analysis on the
transformed data, the results are:

$$\hat{\beta} = 1.0855$$

$$\hat{V}[\hat{\beta}] = .00016$$

and thus

$$z = 6.81.$$

Once again we observe that the variance of the least squares
estimator is smaller than that of the norm distance estimator. As
a result we find significance when in fact there is no significant
difference between the systems, and we are led astray by the least
squares method because it does not account for measurement error.

Recommendations 

If measurement errors exist, then it is best to use an
appropriate estimation technique, making sure to account for these
errors in both variables. As we have seen with the application,
ignoring measurement error may well result in inaccurate
conclusions.

If this technique of estimation is to be used, then it is best
to select a large sample whenever possible. We have seen that when
n = 100 the density of $\hat{\beta}$ is approximately normal for practically
all typical situations, and even some less than typical. If smaller
samples are necessary, then it is best to select values of X such
that the spread of X is large relative to $\sigma_e^2$ and $\sigma_d^2$. For samples
of size 36, 48, and larger we have seen how the density reasonably
resembles a normal one for cases when $\sigma^2$ was larger than $\sigma_e^2$ and $\sigma_d^2$.
However, for smaller samples there are many cases in which the
density is far from resembling a normal or even a t distribution
even when $\sigma^2$ is considerably larger than $\sigma_e^2$ and $\sigma_d^2$. For these cases we
have no reliable test statistic.

It is best to sample replicates (repeated measures) whenever
possible in order to estimate lambda. As we have seen, the often
recommended use of $\lambda = 1$ adds additional bias to the estimate if $\lambda$
$\neq 1$. However, using $\lambda = 1$ is better than completely disregarding
the measurement error.

If one is testing a hypothesis and obtains an estimate that
seems extremely different from that specified in the hypothesis,
then one should test for a significant correlation. If the
correlation coefficient is not significantly different from zero,
then the estimate of $\beta$ is an unreliable one. Hence, one should
resample if possible.
The goal of this study was to evaluate the design and
operational characteristics of the Small Area Camouflage
Cover (SACC), when used by ground soldiers in a tactical
environment. The SACC is designed to conceal individuals,
small size equipment and fighting positions. Fifty-nine
reserve soldiers from the 187th Infantry Brigade, Fort
Devens, MA were given the SACC to be incorporated in their
training at the Canadian Forces Training Center, New
Brunswick, Canada. They were given instructions on the use
of the SACC before the start of the maneuvers. Ten days
later, at the conclusion of the exercises, the soldiers were
presented a questionnaire/survey of twenty-two SACC design
and operational characteristics, from which they made
individual paired comparisons to determine which of the
characteristics were most important. Each characteristic
was independently evaluated by each soldier twenty-one times
for a total of two hundred and thirty-one paired
comparisons. A parametric statistical analysis was
conducted upon the results of the questionnaire/survey, and
six statistically significant ($\alpha$ = 0.05) groups of
characteristics were determined, with the groups defining a
continuum from most to least preferred. This study joined
the expertise of an engineer, statistician, and
psychologist, and gave the investigators a unique testing
challenge of obtaining hard empirical data from a subjective
operational field test.
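The counts quoted above follow from comparing every characteristic with every other one exactly once. A minimal sketch that enumerates the pairs (the labels C1 to C22 are placeholders, not the actual characteristic names):

```python
from itertools import combinations

# Placeholder labels for the twenty-two design and operational characteristics.
characteristics = [f"C{i}" for i in range(1, 23)]

# Every unordered pair is presented once.
pairs = list(combinations(characteristics, 2))

# Number of comparisons in which each characteristic takes part.
appearances = {c: sum(c in pair for pair in pairs) for c in characteristics}

if __name__ == "__main__":
    print(len(pairs), set(appearances.values()))
```

With 22 characteristics this gives 22 x 21 / 2 = 231 pairs, and each characteristic appears in 21 of them.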

1.0  SECTION  1  -  INTRODUCTION 

The Small Area Camouflage Cover (SACC) is a
continuation of a program begun in 1986 to develop an
Individual Camouflage Cover (ICC). The original program was
sponsored by FORSCOM and resulted in prototype arctic,
woodland, and desert ICCs. The SACC development, sponsored
by the Soldier Enhancement Program, extended the original
design by developing a more effective, durable, and
versatile camouflage capability including a cover for
tropical backgrounds.

The SACC is designed to provide protection from visual,
near-infrared, and radar observation, and in the arctic
version, also provides protection from ultra-violet
detection. The SACC will conceal individual troops, or can
be attached together for use over weapon emplacements,
fighting positions, and supply caches.

In  designing  the  SACC,  certain  characteristics,  such  as 
color/texture  match  with  background,  lightweight,  low  shine, 
and  durability  were  used  as  guidelines.  In  addition,  18 
other  technical  and  operational  design  characteristics  have 
been  determined  for  the  SACC.  In  order  to  finalize  the 
current  development  and  to  put  emphasis  on  the  most 
important  requirements  in  future  SACC  designs,  the  design 
characteristics  needed  to  be  ranked  in  their  order  of 
importance. To determine the order of importance, a troop
test was conducted using soldiers from the 3rd Bn, 35th Inf,
187th Bde from Fort Devens, MA. The test was conducted
during exercise Nordic Shield II at the Canadian Forces
Combat Training Center near Gagetown, New Brunswick, Canada
in August 1992.

2.0  SECTION  II  -  EXPERIMENTAL  DESIGN 

2.1 Test Target

The test target was a woodland SACC developed at Fort
Belvoir, VA. It is reversible, with a two-color green
pattern on one side and a four-color brown pattern on the
reverse. The SACC is made of incised, vinyl coated nylon
scrim, weighs less than 518 grams (18 ounces), and is small
enough, 2.76 x 1.77 meters (4'6" x 7'), to be fitted into the
pocket of a soldier's uniform. The SACC also has near-infrared
and radar camouflage characteristics.

2.2  Test  Site 

The test site was located at the Canadian Forces Combat
Training Center, New Brunswick, Canada. The area represented
a typical north temperate zone woodland environment,
consisting of large open fields of grassland and large
tracts of dense coniferous and deciduous forests.

2.3 Test Subjects

A  total  of  59  reserve  troops  from  the  3rd  Bn,  35th  Inf, 
187th  Inf  Bde,  Fort  Devens,  MA  participated  in  the  study. 

The  troops  consisted  of  enlisted  personnel,  non-commissioned 
officers,  and  commissioned  officers. 

2.4 Test Procedure


The troops were issued 25 SACCs to be used during their
tactical exercise. These SACCs were eventually used by 59
soldiers. At the conclusion of the exercise, a
questionnaire/survey (Table 1) was given to the troops, in
which they made individual comparisons between 22 technical
and operational design characteristics. The procedure
involved comparing each characteristic to all the others, a
pair at a time. The task was to decide which of each pair
of characteristics was the more important. Each soldier
made a total of 231 paired comparisons, with each
characteristic being evaluated 21 times. The comparison was
made as follows: If the evaluator preferred the column
characteristic over the row characteristic in Table 1, a
one was placed in the box. If the row characteristic was
preferred over the column characteristic, a zero was placed
in the box. The ones for the row of each characteristic
were added to the number of zeros for the corresponding
column to produce a total acceptance score. The larger the
acceptance score, the more important the evaluator
considered the characteristic. The soldier was instructed
not to skip any comparisons.
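The counts quoted in the procedure follow from simple combinatorics: 22 characteristics give 22 x 21 / 2 = 231 distinct pairs, and each characteristic appears in 21 of them. The short Python sketch below illustrates this and a tally of acceptance scores. The function name `acceptance_scores` and the "evaluator" that always prefers the earlier code letter are hypothetical and serve only as an illustration; they are not part of the test.

```python
from itertools import combinations

def acceptance_scores(items, prefer):
    """Tally one evaluator's paired comparisons.

    prefer(a, b) must return whichever of a, b is judged more important.
    Returns (scores, number_of_pairs).
    """
    scores = {item: 0 for item in items}
    n_pairs = 0
    for a, b in combinations(items, 2):
        scores[prefer(a, b)] += 1   # one point to the preferred characteristic
        n_pairs += 1
    return scores, n_pairs

# Code letters A through V, as in Table 1.
codes = [chr(ord("A") + i) for i in range(22)]

# Hypothetical evaluator: always prefers the earlier code letter.
scores, n_pairs = acceptance_scores(codes, lambda a, b: min(a, b))

print(n_pairs)                    # number of paired comparisons
print(scores["A"], scores["V"])   # highest and lowest possible scores
```

Because every comparison awards exactly one point, the acceptance scores of a single evaluator always sum to the number of pairs.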

3.0 SECTION III - RESULTS

The soldiers answering the questionnaire/survey
produced sufficient data to enable a ranking of the subject
design and operational SACC characteristics, from most
desired to least desired. Table 2 shows the descriptive
data with the sample size, mean, standard deviation,
standard error, and the 95 percent confidence interval for
each characteristic. Table 3 contains the analysis of
variance, while Table 4 shows Scheffe's Multiple Range
Procedure, which separates the operational characteristics
into statistically different groups. The higher the mean,
the greater the preference the evaluator had for the
characteristic. In all cases, a code letter is used for the
characteristic (see Table 1).

TABLE 1

SACC DESIGN AND OPERATIONAL CHARACTERISTICS

CODE
LETTER  CHARACTERISTIC

A - Woodland, arctic, tropic and desert SACCs should be
    reversible (i.e., another seasonal or background color
    on back)
B - Must not make noise when being handled
C - Shelf life of 10 years
D - SACC should be no bigger than 4 1/2 by 7 feet
E - Must be non-flammable
F - Weight does not hinder transport
G - Does not interfere with vision
H - Must not shine or glare
I - Must be easily carried
J - Must be able to be joined with other SACC units to
    place over larger objects such as HMMWV or gun position
K - Must be fungus resistant
L - Does not interfere with hand movement
M - Offers protection against visual detection (matches
    background color, texture, breaks up outline of hull
    and tracks)
N - Must not snag
O - Offers protection against radar detection
P - Must not greatly increase the body temperature of a
    soldier under the SACC
Q - Offers protection against thermal detection
R - Field life (durability, color fading, etc.) of 60 days
S - Must not present a health hazard
T - Must be easy to use
U - Offers protection against near-infrared detection
V - Must be easily carried by M1 Tank or Bradley

[Comparison grid: columns A through V, rows B through V, with
one box for each pair of characteristics.]

TABLE 2

MEAN PREFERENCE DESCRIPTIVE DATA FOR
THE SACC DESIGN AND OPERATIONAL CHARACTERISTICS

                                                 95% CONFIDENCE
CHARAC-    SAMPLE           STANDARD   STANDARD     INTERVAL
TERISTIC   SIZE     MEAN    DEVIATION  ERROR      LOWER    UPPER

A          1239     .5876   .4925      .0140      .5601    .6150
B          1239     .4931   .5002      .0142      .4653    .5210
C          1239     .5278   .4994      .0142      .5000    .5557
D          1239     .6392   .4804      .0136      .6124    .6660
E          1239     .4560   .4983      .0142      .4282    .4838
F          1239     .5738   .4947      .0141      .5463    .6014
G          1239     .5157   .5000      .0142      .4879    .5436
H          1239     .5343   .4990      .0142      .5065    .5621
I          1239     .5214   .4997      .0142      .4935    .5492
J          1239     .4479   .4975      .0141      .4202    .4757
K          1239     .4318   .4955      .0141      .4042    .4594
L          1239     .4019   .4905      .0139      .3746    .4293
M          1239     .6336   .4820      .0137      .6067    .6604
N          1239     .5093   .5001      .0142      .4814    .5372
O          1239     .5609   .4965      .0141      .5333    .5886
P          1239     .4294   .4952      .0141      .4018    .4570
Q          1239     .4512   .4978      .0141      .4234    .4789
R          1239     .5198   .4998      .0142      .4919    .5476
S          1239     .4213   .4940      .0140      .3938    .4488
T          1239     .4044   .4910      .0139      .3770    .4317
U          1239     .5028   .5002      .0142      .4749    .5307
V          1239     .4366   .4962      .0141      .4090    .4643
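
The entries of Table 2 can be checked against one another. The sample size of 1239 equals 59 soldiers x 21 comparisons per characteristic, and because each response is a zero or a one, the standard deviation is determined by the mean. The Python sketch below recomputes the row for characteristic A under two assumptions that the paper does not state: that the sample (n - 1) standard deviation was reported, and that the interval is the mean plus or minus 1.96 standard errors. The function name `describe_binary` is hypothetical.

```python
from math import sqrt

def describe_binary(mean, n, z=1.96):
    """Descriptive statistics for n zero/one responses with the given mean.

    Assumes the sample (n - 1) standard deviation and a normal-theory
    interval of mean +/- z standard errors.
    """
    sd = sqrt(mean * (1.0 - mean) * n / (n - 1))
    se = sd / sqrt(n)
    return sd, se, mean - z * se, mean + z * se

n = 59 * 21                      # soldiers x comparisons per characteristic
sd, se, lower, upper = describe_binary(0.5876, n)

print(n)
print(round(sd, 4), round(se, 4))
print(round(lower, 4), round(upper, 4))
```

The recomputed values agree with the printed row for A to within rounding of the last digit.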

TABLE 3

ANALYSIS OF VARIANCE FOR DESIGN
AND OPERATIONAL CHARACTERISTICS PREFERENCE

              DEGREES OF   SUM OF       MEAN                 SIGNIFICANCE
              FREEDOM      SQUARES      SQUARE    F          LEVEL

Requirement   21           127.4798     6.0705    24.7248    0.000*
Error         27,236       6687.0202    .2455
TOTAL         27,257       6814.5000

Bartlett's Test for Homogeneous Variances

Number of Degrees of Freedom = 21
F = 0.306     Significance Level α = 0.999

*Significant at α less than 0.001 level.

Table 3 indicates that there were significant
differences in the soldiers' preferences for the listed design
and operational SACC characteristics. Bartlett's Test
indicated that the variances of the characteristics are
homogeneous, i.e., not significantly different, so they are
from the same population.
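
The analysis of variance entries are linked by simple arithmetic: each mean square is a sum of squares divided by its degrees of freedom, and the F ratio is the ratio of the two mean squares. The following Python sketch recomputes them from the sums of squares and degrees of freedom printed in Table 3. The variable names are illustrative only.

```python
# Values as printed in Table 3.
ss_requirement, df_requirement = 127.4798, 21
ss_error, df_error = 6687.0202, 27236

# 22 characteristics, 1239 observations each.
n_groups, n_per_group = 22, 1239
n_total = n_groups * n_per_group

ms_requirement = ss_requirement / df_requirement   # between-characteristic mean square
ms_error = ss_error / df_error                     # error mean square
f_ratio = ms_requirement / ms_error

print(n_groups - 1, n_total - n_groups, n_total - 1)   # degrees of freedom
print(round(ms_requirement, 4), round(ms_error, 4), round(f_ratio, 2))
```

The degrees of freedom (21, 27,236 and 27,257) and the total sum of squares (127.4798 + 6687.0202 = 6814.5000) are consistent with 22 groups of 1239 observations.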

Scheffe's Multiple Range Test (Table 4)
was used to determine where these significant differences in
preferences occurred. This test separates a set of
significantly different means into subsets of homogeneous
means.

TABLE 4

SCHEFFE'S MULTIPLE-RANGE TEST - SACC DESIGN
AND OPERATIONAL CHARACTERISTICS PREFERENCE

WORST                                                     BEST
GROUP 1    GROUP 2    GROUP 3    GROUP 4    GROUP 5    GROUP 6

L .4019    S .4213    J .4479    B .4931    R .5198    C .5278
T .4044    P .4294    Q .4512    U .5028    I .5214    H .5343
S .4213    K .4318    E .4560    N .5093    C .5278    O .5609
P .4294    V .4366    B .4931    G .5157    H .5343    F .5738
K .4318    J .4479    U .5028    R .5198    O .5609    A .5876
V .4366    Q .4512    N .5093    I .5214    F .5738    M .6336
J .4479    E .4560    G .5157    C .5278    A .5876    D .6392
Q .4512    B .4931    R .5198    H .5343    M .6336
E .4560    U .5028    I .5214    O .5609
B .4931    N .5093    C .5278    F .5738
U .5028    G .5157    H .5343    A .5876
N .5093    R .5198    O .5609
G .5157    I .5214
           C .5278
           H .5343
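
The widths of the groups in Table 4 can be related to the quantities in Tables 2 and 3. In the usual form of Scheffe's procedure, two means are declared different when their difference exceeds sqrt((k - 1) F MSE (2/n)), where k is the number of groups, n the common sample size, MSE the error mean square and F the critical value at level α. The Python sketch below evaluates this threshold. It assumes α = 0.05, and it approximates the F critical value with 21 and 27,236 degrees of freedom by the tabulated chi-square 95th percentile for 21 degrees of freedom (32.671) divided by 21; both are assumptions of this sketch, not statements from the paper.

```python
from math import sqrt

def scheffe_critical_difference(k, n_per_group, mse, f_crit):
    """Smallest difference between two group means flagged by Scheffe's method."""
    return sqrt((k - 1) * f_crit * mse * (2.0 / n_per_group))

K = 22               # characteristics A through V
N_PER_GROUP = 1239   # sample size per characteristic (Table 2)
MSE = 0.2455         # error mean square (Table 3)
# Assumption: large-sample approximation to F(0.05; 21, 27236).
F_CRIT = 32.671 / 21

crit = scheffe_critical_difference(K, N_PER_GROUP, MSE, F_CRIT)
print(round(crit, 3))

# A few means from Table 2 for illustration.
means = {"L": 0.4019, "C": 0.5278, "H": 0.5343, "D": 0.6392}

def differ(a, b):
    """True when the two characteristics would be declared different."""
    return abs(means[a] - means[b]) > crit

print(differ("L", "D"), differ("C", "H"))
```

Under these assumptions the threshold is about 0.114, which is close to the widest ranges spanned by the groups in Table 4 (for example .5157 - .4019 = .1138 in Group 1). The agreement is only approximate because the tabulated means are rounded.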

4.0 SECTION IV - DISCUSSION

The  questionnaire  was  successful  in  determining  which 
design  and  operational  characteristics  were  deemed  most 
important  and  least  important  as  judged  by  the  ground  troops 
(Table  4).  The  most  important  characteristics  for  the  SACC 
were  as  follows: 

•  No  larger  than  4  1/2  by  7  feet 

•  Offer  protection  against  visual  detection 

•  Woodland,  arctic,  tropic  and  desert  SACCs  should  be 
reversible 

•  Weight  does  not  hinder  transport 

•  Offers  protection  against  radar  detection 

•  Must  not  shine  or  glare 

•  Shelf  life  of  10  years 

Each group of characteristics differs significantly (α = 0.05)
from each other. The six least important characteristics
were:


•  Does  not  interfere  with  hand  movement 

•  Must  be  easy  to  use 

•  Must  not  present  a  health  hazard 

• Must not greatly increase the body temperature of the
soldier under the SACC

•  Must  be  fungus  resistant 

•  Material  is  durable 

Note that most of the characteristics overlap into adjoining
groups. However, there were few surprises among the most
preferred characteristics. As expected, the SACC should be
small, lightweight, and able to blend with the background,
hence reversible. The least preferred characteristics of
not being a health hazard, being easy to use, being durable,
and not interfering with hand movements give an insight into
the soldiers' thoughts on camouflage. That is, if it works
and is not too hard to carry, the soldier will put up with
hardships.

The  following  requirements  fell  into  the  middle  range 
of  preference,  i.e.,  groups  3  and  4: 

•  Must  not  make  a  noise  when  being  handled 

•  Offers  protection  against  near-infrared  detection 

•  Must  not  snag 

•  Does  not  interfere  with  vision 

•  Field  life  (durability,  color,  fading,  etc.)  of  60 
days 

•  Must  be  easily  carried 

•  Shelf  life  of  10  years 

•  Must  not  shine  or  glare 

•  Offers  protection  against  radar  detection 

•  Weight  does  not  hinder  transport 

•  Must  be  able  to  be  joined  with  other  SACC  units  to 
place  over  larger  objects 

•  Offers  protection  against  thermal  detection 

• Must be non-flammable

The proper identification of important and unimportant
characteristics precludes the possibility of incorrectly
assigning resources to a characteristic which has little
practical importance. A good example of this would possibly
be characteristics S (must not present a health hazard) and
G (does not interfere with vision).

5.0 SECTION V - SUMMARY AND CONCLUSIONS

A total of 59 soldiers from the 3rd Bn, 35th Inf, 187th
Inf Bde, Fort Devens, MA participated in the study. During
their field training, they used the SACC to conceal
individual troops, weapon emplacements, fighting positions,
and supply caches. Upon completion of the exercises, they
were given a questionnaire/survey in which the soldiers made
individual comparisons between 22 design and operational
characteristics. Their task was to decide which of each
pair of characteristics was the more important. Each
subject made a total of 231 paired comparisons, with each
characteristic being evaluated 21 times. A review of the
data indicated that the statistical procedures enabled the
investigators to determine the most important and least
important characteristics. Logical decisions on how to
expend resources on the development of new camouflage can
now be determined from what otherwise would be viewed as a
large pool of subjective responses out of which few
objective conclusions could be drawn.

Tree-Structured Statistical Methods

Abstract

Recent developments in tree-structured methods are reviewed with
emphasis on extensible and computationally efficient strategies.


1  Introduction 

Tree-structured methods are compute-intensive statistical procedures that
yield decision trees as solutions for classification and regression problems.
Two early methods are the AID and THAID (Morgan and Sonquist, 1963;
Morgan and Messenger, 1973) computer programs for regression and
classification. These methods construct binary decision trees by recursively
partitioning a data set. At each stage, all possible splits of the data in the
partition are examined to find one that maximally reduces node impurity,
where impurity is defined in terms of entropy or mean square error. AID and
THAID were later superseded by CART (Breiman, Friedman, Olshen and
Stone, 1984), whose most important contribution was a method of "pruning"
to get a tree of approximately the right size. CART, however, adopted
the slow split-finding strategy of its predecessors.

The FACT (Vanichsetakul, 1986; Loh and Vanichsetakul, 1988) method
uses standard linear statistical techniques such as linear discriminant
analysis and analysis of variance tests to find splits. It also uses a direct
stopping rule similar to that in AID and THAID, instead of pruning. As a
result, although FACT performs well in many applications, datasets can be
constructed to fool it. Further, being based on linear discriminant analysis,
FACT does not always give binary splits; it splits each node into as many
subnodes as there are classes. On the other hand, FACT is usually ten to
several hundred times faster than CART.


2  Main  results 

Several new algorithms have been developed recently at the University of
Wisconsin that combine the pruning method of CART with the fast splitting
method of FACT. These algorithms share a common philosophy of sacrificing
local split optimality for computational speed and ease of extensibility to
generalized regression settings. Because of their ability to fit complex models
quickly, the statistical accuracy of these methods is typically as good as, if
not better than, CART's. Shih (1993) develops a likelihood-based method of
split selection for categorical variables and a method of grouping more than
two classes into two superclasses to allow binary splits. Chaudhuri, Huang,
Loh and Yao (1994) describe a method of tree-structured regression that
yields, if desired, smooth estimates of the function and its derivatives.
Conditions for asymptotic consistency of the estimates are provided. Chaudhuri,
Lo, Loh and Yang (1993), Lo (1993) and Yang (1993) generalize these ideas
to tree-structured Poisson regression and logistic regression models.
Extensions to stratified regression modeling of censored data using piecewise
parametric and nonparametric models (such as Weibull and proportional
hazards models) are reported in Loh (1991), Ahn (1992) and Ahn and Loh
(1994).

The key ideas may be summarized as follows.

1. Use of a grouping procedure if necessary to combine classes into two
superclasses at each node prior to splitting. This ensures binary splits.

2. Use of two-sample f-tests for differences between means and variances
to select the variable to split a node, in the case of univariate splits.
These tests are also used to detect patterns in residual plots to guide
split selection in regression.

3. Use of linear discriminant analysis to determine the best linear
combination split or the best univariate split on the selected variable.

4. Use of CART's pruning method to determine the final size of the tree.

5. Use of linear projections with dummy variable coding to convert
categorical variables into ordered variables before splitting.

6. Use of maximum likelihood fitting for piecewise generalized regression.

7. Use of weighted averaging to produce smooth estimates of the function
and its derivatives.
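
Ideas 2 and 3 can be illustrated with a small Python sketch for the two-superclass, univariate case. It is only an illustration under stated assumptions and is not the authors' algorithm, whose details are reported elsewhere. Each variable is scored by the one-way analysis of variance F statistic for two groups, which equals the square of the pooled two-sample t statistic and so ranks variables the same way as that test. The selected variable is then split at the boundary given by univariate linear discriminant analysis with a pooled variance and priors equal to the class proportions. The function names `two_sample_f` and `select_and_split` and the toy data are hypothetical.

```python
from math import log
from statistics import mean

def two_sample_f(x0, x1):
    """One-way ANOVA F statistic for two groups (squared pooled t).

    Returns (F, mean of x0, mean of x1, pooled variance).
    """
    n0, n1 = len(x0), len(x1)
    m0, m1 = mean(x0), mean(x1)
    grand = (sum(x0) + sum(x1)) / (n0 + n1)
    ss_between = n0 * (m0 - grand) ** 2 + n1 * (m1 - grand) ** 2
    ss_within = sum((v - m0) ** 2 for v in x0) + sum((v - m1) ** 2 for v in x1)
    pooled_var = ss_within / (n0 + n1 - 2)
    return ss_between / pooled_var, m0, m1, pooled_var

def select_and_split(rows, labels):
    """Pick the variable with the largest F, then split it by univariate LDA.

    rows: list of feature lists; labels: 0/1 superclass labels.
    Returns (variable index, cut point).
    """
    best = None
    for j in range(len(rows[0])):
        x0 = [r[j] for r, y in zip(rows, labels) if y == 0]
        x1 = [r[j] for r, y in zip(rows, labels) if y == 1]
        f, m0, m1, s2 = two_sample_f(x0, x1)
        if best is None or f > best[0]:
            best = (f, j, m0, m1, s2, len(x0), len(x1))
    _, j, m0, m1, s2, n0, n1 = best
    # LDA boundary with equal variances and priors n0/(n0+n1), n1/(n0+n1).
    cut = (m0 + m1) / 2.0 + s2 * log(n1 / n0) / (m0 - m1)
    return j, cut

# Toy data: variable 0 separates the superclasses, variable 1 does not.
rows = [[1.0, 5.0], [2.0, 3.0], [3.0, 4.0], [7.0, 4.0], [8.0, 5.0], [9.0, 3.0]]
labels = [0, 0, 0, 1, 1, 1]
variable, cut = select_and_split(rows, labels)
print(variable, cut)
```

The point of the sketch is the cost structure: variable selection needs only one pass of means and variances per variable, and no candidate split points are enumerated.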




The details of the algorithms will be reported elsewhere. The practical
advantages of this strategy over CART are:

1. Computational speed. CART finds linear combination splits on ordered
variables by global optimization over all coefficients in the linear
combination. Our method is much more efficient because it uses linear
discriminant analysis. In the case of regression, CART fits a model to
each subnode for every split considered. Since it examines all possible
splits, this process is very time consuming. Our approach fits a model
to each subnode only after a split is selected. Hence model fitting is
performed only once at each node.

2. Treatment of categorical variables. To find the best split on a
categorical variable, CART searches over all subsets of categories. Because
the number of such splits increases exponentially with the number of
categories, this is also a very time consuming process. Another problem
is that this strategy tends to prefer splits on categorical variables
with many categories over splits on ordered variables. Our approach
of converting each categorical variable into an ordered variable avoids
this problem and speeds up split selection.

3. Boolean combination splits on categorical variables. These can be quickly
obtained via linear combinations of transformed categorical variables.
Global optimization strategies are impractical because of the large
number of splits that need to be evaluated.

4. Versatility in model fitting. Because model fitting is performed after
split selection, models of arbitrary complexity (such as GLIM or
proportional hazards models) may be fitted to each node at little
additional cost.
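
The exponential growth mentioned in item 2 is easy to quantify. A categorical variable with c categories admits 2^(c-1) - 1 distinct binary splits into two non-empty subsets, whereas an ordered variable with c distinct values admits only c - 1 threshold splits. The Python sketch below tabulates both counts; the function names are illustrative.

```python
def categorical_subset_splits(c):
    """Number of distinct binary partitions of c categories into two non-empty sets."""
    return 2 ** (c - 1) - 1

def ordered_splits(c):
    """Number of threshold splits on an ordered variable with c distinct values."""
    return c - 1

for c in (4, 10, 22):
    print(c, categorical_subset_splits(c), ordered_splits(c))
```

Converting a categorical variable to an ordered one therefore reduces the number of candidate splits from exponential to linear in the number of categories.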

This is a clinical paper addressing means to combine the results of
a number of studies on two simulation models, the desired result of
which is to identify a balanced cost-effective set of survivability
enhancements for a direct-fire armored weapon system at an
acceptable risk, so that these enhancements may be made a part of
the engineering specifications for the weapons system. Passive
survivability elements consist of ballistic protection measures,
and signature reduction in the areas of RF, visual, and thermal
spectra. Countermeasures considered cover smoke, receivers,
jammers, and active protection systems. The intent is to maximize
the use of passive measures, avoiding high technology, high risk
solutions, and avoiding highly sophisticated active
countermeasures.

SIMULATION  MODELS 

Simulation models are used in the study to evaluate the
effectiveness of the system in combat given the enhancements of the
suites of countermeasures, signature reduction, and ballistic
protection. The two models are the GROUNDWARS few-on-few direct
fire and artillery simulation, and the Combined Arms and Support
Task Force Evaluation Model (CASTFOREM), a many-on-many battlefield
simulation.

GROUNDWARS, maintained by the Army Materiel Systems Analysis
Activity (AMSAA), is used primarily to evaluate weapon system
effectiveness by representing land combat between homogeneous
forces, where the total number of combatants cannot exceed twenty,
and where the systems have a limited representation of sensors and
munitions. A statistical terrain is represented. GROUNDWARS is
stochastic, employing Monte Carlo probability theory as its primary
solution technique; three hundred replications of a case are
normally employed.

CASTFOREM, maintained by the TRADOC Analysis Center - WSMR
(TRAC-WSMR), is a stochastic, event sequenced, force-on-force simulation
of ground combat involving up to a BLUE brigade and opposing RED
forces. It is used for weapon system trade-off analyses,
investigation of alternate tactics, parametric analyses of selected
weapon system performance parameters, and other similar studies.
CASTFOREM is extremely flexible, and can accommodate any terrain or
weapon system for which data is available. Terrain used is
digitized actual terrain. Weather and ambient light conditions are
constant throughout a battle. Battlefield obscurants, smoke, and
dust are modeled as dynamic clouds. Processes are modeled
probabilistically using Monte Carlo techniques; the model is
stochastic event sequenced, although time-step events are possible.
Normally, 21 replications of a case are employed.

COUNTERMEASURES  PLAYED  IN  GROUNDWARS 

The following table describes the variations of
countermeasure suites for the system examined through the use of
the GROUNDWARS model. A 'Y' in a cell indicates that the
suite was examined.

The  scenarios  used  are  the  following: 

SCEN  A:  This  scenario  represents  a  BLUE  mechanized  infantry 
task  force  in  a  prepared  defense  against  an  overwhelming 
modern  RED  armor  attack.  The  setting  is  Central  Europe, 
Winter,  with  snow  on  the  ground  and  7  kilometers  visibility. 

SCEN B: This scenario represents a meeting engagement between
a BLUE mechanized infantry brigade and a modern RED tank
regiment. The setting again is Central Europe in Winter with
snow and 7 kilometers visibility.

SCEN C: This scenario is set in late spring in Southwest Asia,
dusty with 14 kilometers visibility. The BLUE force is a
mechanized infantry battalion (+) in a hasty defensive posture
encountering two Threat tank battalions equipped with current
equipment. Threat counter-maneuver artillery is minimal.

The  countermeasures  described  in  the  table  are  as  follows: 

LWR: Laser warning receiver - detects when the system is being
lased by a threat rangefinder or detects a missile guidance
laser.

MWS:  Missile  warning  system/muzzle  flash  detector  -  detects 
the  launch  of  a  missile  or  the  flash  of  a  gun. 

RWR:  Radar  warning  receiver  -  indicates  when  being  painted  by 
radar. 

SMK: Signifies the employment of self-protective smoke in the
visual, infrared, and millimeter wave spectra in the direction
of the perceived threat munition.

JAM: Infrared Jammer - disrupts the infrared tracking beacon
on an incoming missile.

SLID: Small, low-cost intercept device - a proposed
counter-missile system.

SHORTSTOP:  A  proposed  artillery  countermeasure  device. 

Several suites are not considered for the following reasons:

a. Smoke would not be used in the SCEN A prepared defensive
position, except if the system were on the move and exposed.
Smoke was not used in GROUNDWARS (but was used in CASTFOREM).

b. In SCEN A, SCEN B and SCEN C the LWR alone would not pick
up the threat missiles to JAM, so the cases were not played.

c. In SCEN B and SCEN C the MWS alone is unable to detect the
target and cue smoke, so that combination was not played.

d. In SCEN C there was no radar threat portrayed, so RWR was
not considered.

e. In SCEN C there was no artillery affecting the system, so
SHORTSTOP was not played.

COUNTERMEASURE  SUITES  IN  CASTFOREM 

Using the same chart for the countermeasure suites played in
GROUNDWARS, the suites that were evaluated in CASTFOREM are
denoted with a 'C'.


Table 1: COUNTERMEASURE SUITE COMBINATIONS
GROUNDWARS and CASTFOREM

POSSIBLE SUITE          SCEN A    SCEN B    SCEN C

BASELINE (No CM)
RWR, SMK
RWR, JAM
RWR, SMK, JAM
LWR, MWS
LWR, MWS, SMK
LWR, MWS, JAM
LWR, MWS, SMK, JAM
LWR, RWR

[The 'Y' and 'C' cell entries of this table are not legible in the source.]

SIGNATURE REDUCTION IN GROUNDWARS and CASTFOREM

Threat systems employ a variety of means to detect and to bring
fire onto our system of concern. These means are direct view
optics, low-light television systems, thermal imaging systems,
ground surveillance radars, and seekers in "smart" artillery
munitions. If the ability of the system to be detected or to be
accurately pinpointed were reduced, its survivability would be
enhanced. It is possible to reduce the signature of the vehicle
through the use of various suites of coatings and shaping. The
signature reduction suites are represented on the vehicle by
specifying an average detection range reduction achievable against
the array of threat sensors represented in the scenarios.
Probability of detection by threat seekers as a function of slant
range and target signature was determined using Booz-Allen and
Hamilton's Desktop Radar and Infrared Signature Model. This
reduction is as measured by the NVEOL sensor curves for the threats
of interest; the point where the probability of detect curve is 50%
(Pdet = 0.5) for the range desired was taken as the target reduction
criteria. (The Johnson criteria of one cycle was used for
detection.) Five levels of signature reduction were played in both
the GROUNDWARS and CASTFOREM models, and were designated Level A
through Level E.
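
The Pdet = 0.5 criterion amounts to reading a range off a tabulated detection curve. The Python sketch below shows one way this could be done, by linear interpolation between tabulated points. The curve values are invented for illustration and are not taken from the NVEOL curves or from the study; the function name `range_at_probability` is hypothetical.

```python
def range_at_probability(ranges_km, p_detect, target=0.5):
    """Range at which a tabulated detection curve crosses the target probability.

    Assumes p_detect decreases as range increases. Uses linear
    interpolation between neighboring points; returns None if the
    curve never crosses the target.
    """
    points = list(zip(ranges_km, p_detect))
    for (r0, p0), (r1, p1) in zip(points, points[1:]):
        if p0 >= target >= p1 and p0 != p1:
            return r0 + (p0 - target) * (r1 - r0) / (p0 - p1)
    return None

# Hypothetical curves: baseline vehicle and a reduced-signature vehicle.
ranges_km = [1.0, 2.0, 3.0, 4.0]
baseline_pdet = [0.9, 0.7, 0.3, 0.1]
reduced_pdet = [0.7, 0.3, 0.1, 0.05]

r_base = range_at_probability(ranges_km, baseline_pdet)
r_reduced = range_at_probability(ranges_km, reduced_pdet)
reduction_percent = 100.0 * (r_base - r_reduced) / r_base
print(r_base, r_reduced, round(reduction_percent, 1))
```

A signature reduction level could then be expressed as the percentage reduction in this 50 percent detection range relative to the baseline.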

Table 2: SIGNATURE REDUCTION IN GROUNDWARS and CASTFOREM

DETECTION RANGE     SCEN A    SCEN B    SCEN C

BASELINE SYSTEM     YC        YC        YC
Level A             YC        YC        YC
Level B             YC        YC        YC
Level C             YC        YC        YC
Level D             YC        YC        YC
Level E             YC        YC        YC

BALLISTIC  PROTECTION  IN  GROUNDWARS  AND  CASTFOREM 


The ballistic protection suites added to the vehicle are in
addition to the base armor package inherent with the system. This
added ballistic protection would be against direct fire kinetic
energy and chemical energy munitions, and indirect fire (artillery)
munitions. The differences inherent in the penetrations from
direct fire rounds and indirect fire munitions cause the ballistic
packages to be considered separately, although there would be a
carry-over effect (synergy) one to the other. Three levels of
ballistic protection were considered, based on the probability of
resisting a system kill (as defined by an analysis using the Army
Research Laboratory CAD and evaluation models) given a hit, of 50%,
75%, or 95% (given that the percentage is higher than that of the
standard armor package). (An attempt to design a package that would
withstand the impact of large calibre direct fire munitions, or a
direct impact of artillery HE, was not considered.) These packages
are limited by the power, weight, and dimensional constraints of
the system.


Table 3. BALLISTIC PROTECTION IN GROUNDWARS and CASTFOREM

BALLISTIC PROTECTION    SCEN A    SCEN B    SCEN C

BASELINE SYSTEM         YC        YC        YC
50% DF, 50% IF          YC        YC        YC(DF)
50% DF, 75% IF          YC        YC
50% DF, 95% IF          YC        YC
75% DF, 50% IF          YC        YC        YC(DF)
75% DF, 75% IF          YC        YC
75% DF, 95% IF          YC        YC
95% DF, 50% IF          YC        YC        YC(DF)
95% DF, 75% IF          YC        YC
95% DF, 95% IF          YC        YC

COMBINED  RUNS  IN  CASTFOREM 

A number of runs are scheduled in CASTFOREM to evaluate
combinations of signature reduction, countermeasures, and ballistic
protection. To date, only cases using the countermeasure suite of
LWR, MWS, RWR, and SMK at reduced signature levels A, B, and C,
using the base level of ballistic protection, are scheduled. Other
cases will be considered as a result of preliminary analyses;
suggestions for additional combined case runs are welcomed from the
panel.


MEASURES OF EFFECTIVENESS

Common measures of effectiveness output by GROUNDWARS and CASTFOREM
are system kills, system loss, RED force loss, BLUE force loss, the
system exchange ratio (system kills/system loss), force loss
exchange ratio (RED force loss/BLUE force loss), and surviving
maneuver force ratio (RED maneuver force (initial-final)/BLUE
maneuver force (initial-final)). In CASTFOREM these measures are
available over time for each replication. A metric which could
handle the synergy of the battle over time, and the contribution
differences of the system in various parts of the battle by virtue
of its survival, is envisioned. In these scenarios, the early
contribution of a system could cause it to expend its ammunition
early, and so not contribute later in the battle. However, because
of its early contribution, more BLUE direct fire systems could
survive and participate strongly. The more survivable system's
contribution could be swamped by the end of the battle due to the
synergy. Therefore, a combined metric is visualized.
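
The three ratios defined at the start of this section can be computed directly from the counts of a single replication. The Python sketch below follows the definitions exactly as written above. The class name `ReplicationOutcome`, its field names, and the example counts are hypothetical; they do not reflect the output formats of GROUNDWARS or CASTFOREM.

```python
from dataclasses import dataclass

@dataclass
class ReplicationOutcome:
    """Counts from one simulated battle (hypothetical field names)."""
    system_kills: int
    system_losses: int
    red_force_losses: int
    blue_force_losses: int
    red_maneuver_initial: int
    red_maneuver_final: int
    blue_maneuver_initial: int
    blue_maneuver_final: int

def ratio(numerator, denominator):
    """Return numerator/denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None

def measures(o):
    """Compute the three ratio measures as defined in the text."""
    return {
        "system_exchange_ratio": ratio(o.system_kills, o.system_losses),
        "force_loss_exchange_ratio": ratio(o.red_force_losses, o.blue_force_losses),
        "surviving_maneuver_force_ratio": ratio(
            o.red_maneuver_initial - o.red_maneuver_final,
            o.blue_maneuver_initial - o.blue_maneuver_final,
        ),
    }

# Invented counts for a single replication.
outcome = ReplicationOutcome(12, 4, 30, 10, 40, 15, 20, 12)
result = measures(outcome)
print(result)
```

Returning None for a zero denominator is a choice of this sketch; how such replications are treated when averaging is not stated in the paper.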

Then, once the systems providing the most potential are determined
(ranked?) by their performance in the simulations, additional
factors must be considered, such as the following:

a.  Cost 

b.  Weight  and  size  constraints  placed  on  the  system 

c.  Technological  risk  and  possible  fielding  date 

The intention is to provide the Army with a robust point solution
package to enhance the survivability and performance of the system
and the force.

EPILOGUE 

Since the conference in October 1993, the method of analysis adopted
has been to treat GROUNDWARS and CASTFOREM separately, except where
their findings were mutually supporting, and to use the results from
CASTFOREM as the principal effectiveness determiner. The final full
factorial CASTFOREM runs matrix consisted of the European Defense and
the SWA Meeting Engagement scenarios, two levels of signature
reduction, three countermeasure suites, and three levels of ballistic
protection, which when added to the base level of each factor (no
signature reduction, no countermeasure suite, basic level of
ballistic protection) resulted in a 96-case matrix (2 x 3 x 4 x 4).
The final product of this analysis, known as the LOSAT Survivability
Requirements Study, should be available after the First of April,
1994 from the Technical Management Division, LOSAT Project Office,
U.S. Army Missile Command, Attn: SFAE-ASM-LS, Redstone Arsenal,
Huntsville, Alabama 35898-8051
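
The size of the run matrix can be checked by enumerating the factor
levels. The following is only a minimal sketch; the level labels are
hypothetical placeholders and are not names taken from the study.

```python
from itertools import product

# Factor levels. The labels are hypothetical placeholders; only the
# number of levels per factor comes from the text.
scenarios = ["European Defense", "SWA Meeting Engagement"]
signature = ["none", "level 1", "level 2"]                   # base + 2 levels
countermeasures = ["none", "suite 1", "suite 2", "suite 3"]  # base + 3 suites
ballistic = ["basic", "level 1", "level 2", "level 3"]       # base + 3 levels

# Full factorial design: every combination of one level per factor.
cases = list(product(scenarios, signature, countermeasures, ballistic))
print(len(cases))  # 2 * 3 * 4 * 4 cases
```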




A REDISCOVERY OF THE HODGES-LEHMANN ESTIMATE


PAUL  H.  THRASHER 

Analysis  Branch  of  Analysis  Division 
Material  Test  Directorate 
White  Sands  Missile  Range,  New  Mexico  88002-5157 


ABSTRACT. The Hodges-Lehmann estimate was proposed in 1963 to
balance the risks of estimating too high and too low. It was
recently rediscovered in a consideration of what estimate should be
used after a one-dimensional null hypothesis is rejected.

1. INTRODUCTION. Use of data to find a point estimate for a
parameter is often requested. If no standard or unacceptable value
is provided for the parameter, a point estimate is often found to
(1) stand alone or (2) serve as the midpoint of a confidence
interval. If (1) a standard and unacceptable value exist and (2)
the data and agreed upon Type I and Type II risks imply rejection
of the null hypothesis that the parameter meets the standard, then
the next question is often "How badly does the parameter miss the
requirement?". Although a p-value and a post-test Type II risk can
answer this question, a point estimate is often requested by
managers not versed in statistical language.

If a point estimate is needed, an analyst may well want to
present a more statistically justified number than the commonly
used average or sample median. One general technique is to extend
hypothesis testing. This was (1) done in reliability studies at
White Sands Missile Range with arguments described in sections 2-5
of this paper, (2) presented as a clinical paper to the Thirty-Ninth
Conference on the Design of Experiments in Army Research,
Development, and Testing and (3) recognized by one of the panelists
as the Hodges-Lehmann technique.

2.  RATIONALE.  Any  point  estimate  necessarily  has  limited 
information. It  should  be  made  as  meaningful  as  possible. 

A  point  estimate  might  be  too  high  or  too  low.  An  intuitive 
approach  is  to  adopt  a  goal  of  equal  likelihood;  that  is,  try  to 
equalize  the  risks  of  estimating  too  high  and  too  low. 

One  way  to  approach  this  goal  is  to  think  of  two  hypothesis 
tests  that  (1)  share  the  common  null  hypothesis  of  "The  Desired 
Parameter  Equals  The  Point  Estimate"  and  (2)  have  the  opposing 
alternate  hypotheses  of  "The  Point  Estimate  Is  Less  Than  The 
Desired  Parameter"  and  "The  Point  Estimate  Is  Greater  Than  The 
Desired  Parameter"  as  the  upper  and  lower  alternatives  to  the  null. 
Since  the  p-value  is  the  probability  of  being  wrong  if  the  null  is 
rejected,  the  goal  of  equal  likelihood  can  be  approached  by 
adjusting  the  point  estimate  in  these  two  thought  hypothesis  tests 
until their p-values from data are as close to each other as possible.
This  forces  both  p-values  toward  one  half. 




Clearly each p-value can be forced to exactly one half if the
thought hypothesis tests have p-values that are continuous with the
always continuous point estimate. If such thought hypothesis tests
are not appropriate for the data, an average can be taken of the
two point-estimates that make the two p-values closest to one half.

The  resulting  estimate  can  logically  be  named  and  described  by 
the acronym p-vulte (p-value upper & lower test estimate).
"P-vulte" can be thought of linguistically as a noun; but it is more
informative if it is considered as an adjective in a three word
title.  For  example,  "Gaussian  p-vulte  mean"  or  "Wilcoxon 
p-vulte  median"  denote  both  the  distribution  that  describes  the 
data  and  the  parameter  that  is  being  estimated. 

3. NON-INNOVATIVE RESULTS. For data from populations
described  by  some  common  distributions,  use  of  the  p-vulte 
technique  yields  nothing  new.  For  example,  the  Student's  t 
p-vulte  mean  is  simply  the  sample  average. 

This result is obtained by considering the two areas, of the
probability density function, that are separated by the desired
estimate. Adjusting these areas until they are equal makes both of
them exactly equal to one half. At this point, the test statistic
t is zero. The well known expression for t,

$$ t = \frac{\text{Sample Average} - \text{Population Mean}}{\text{Sample Standard Deviation} / \sqrt{\text{Sample Size}}} , $$

immediately yields the p-vulte mean to be the sample average.

4. BIASED RESULTS. Non-symmetric probability density
functions lead to biased p-vultes. This bias tends to zero as the
sample size becomes very large. One example is the binomial
p-vulte R where R is the reliability (i.e., the probability of one
success in one trial).

Calculation of this p-vulte is direct in concept; but in
practice it requires a computer. Equating the two p-values
is the same as equating two sums of b(j;n,p-vulte) where b is
the function for the binomial probability distribution; x is
the number of successes out of n trials; one sum ranges from
j=0 to j=x; and the other sum ranges from j=x to j=n. After
data is taken, the only unknown in the equation is the p-vulte.
Clearly a numerical solution is possible; but the existence of two
sums causes difficulties. Calculation is facilitated by
(1) pulling the term b(x;n,p-vulte) out of both of the two equal
sums, (2) remembering that the sum of b(j;n,R) from j=0 to j=n must
be one for any R, and (3) arriving at the calculation equation of

$$ 1 = b(x; n, \text{p-vulte}) + 2 \sum_{j} b(j; n, \text{p-vulte}) ; $$

this sum ranges either from j=0 to j=(x-1) or from j=(x+1) to j=n.
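
The calculation equation can be solved numerically in a few lines.
The following is a minimal sketch, assuming a simple bisection
search; the function names and the tolerance are illustrative
choices and not part of the original work.

```python
from math import comb

def binom_pmf(j, n, r):
    """b(j; n, r): probability of j successes in n trials."""
    return comb(n, j) * r**j * (1.0 - r)**(n - j)

def binomial_p_vulte(x, n, tol=1e-10):
    """Solve 1 = b(x;n,R) + 2 * sum_{j<x} b(j;n,R) for R by bisection.

    Only defined here for 0 < x < n.
    """
    if not 0 < x < n:
        raise ValueError("x must satisfy 0 < x < n")

    def g(r):
        # Positive when r is below the solution, negative when above.
        lower = sum(binom_pmf(j, n, r) for j in range(x))
        return binom_pmf(x, n, r) + 2.0 * lower - 1.0

    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(round(binomial_p_vulte(1, 25), 4))
```

Bisection is adequate because the left side minus the right side of
the equation is monotonic in R when 0 < x < n, so there is a single
root between zero and one.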

The bias may be illustrated with an example. For a sample
size n of 25, different values of x yield the binomial p-vulte R
and the maximum likelihood estimator (i.e., x/n) to be

     x     p-vulte     x/n

     1      .0453     .0400
     3      .1247     .1200
     7      .2828     .2800
    12      .4802     .4800
    13      .5198     .5200
    18      .7172     .7200
    22      .8753     .8800
    24      .9547     .9600

All of these p-vultes are biased towards the central possible
value of R (i.e., 0.500). The shifting is greatest for values of
x that are farthest from corresponding to x/n = 0.500.

As an aside, consider the situation when x=0 or x=n. There
are two possible interpretations for the binomial p-vulte R. Just
looking at the summation equations and plots of the distributions
for different possible values of R suggests that these p-vultes are
unbiased (i.e., identically zero and one). However, looking at the
underlying thought hypothesis tests suggests that these
p-vultes are undefined. This occurs because there is no physical
alternative that the estimate should be lower than 0 or higher than
n. Unless a limiting procedure is considered, the binomial p-vulte
R is thus undefined when x=0 or x=n. One philosophical
interpretation is that being undefined is not bad in this
situation; that is, neither perfection nor total failure should be
claimed for the population just because data from any sample fails
to indicate differently.

Finally,  another  example  shows  the  tendency  of  the  bias  to  be 
removed  with  large  sample  sizes.  Choosing  x's  and  n's  such  that 
x/n  is  1/5  yields 


       n     p-vulte

       5      .2161
      10      .2090
      25      .2038
      75      .2013
     250      .2004
    1000      .2001

This  table  exhibits  the  tendency  of  consistency. 

5. ROBUST AND SENSITIVE RESULTS. Application of the
p-vulte technique to the Wilcoxon signed ranks T test yields robust
and sensitive results. This should be expected because the
Wilcoxon signed ranks T test is well known for its high power.

The Wilcoxon p-vulte median may be calculated without first
calculating actual p-values. This is based on the way that the
Wilcoxon p-value calculation is done using a thought experiment:

(1) Consider n chips for the n data of the sample;

(2) Label each chip with

(a) the sample rank of the absolute value of the
difference between the standard & the datum and

(b) a + sign on top if the standard exceeds the datum
but on the bottom if the datum exceeds the standard;

(3) Calculate T+ by summing the ranks on the chips where
the standard exceeds the datum (i.e., +'s are up);

(4) Think of tossing all chips;

(5) Consider all two to the nth power possible landings;

(6) Count the number of possible landings for which the
sum of the ranks of chips with plus sides up is less
than or equal to the T+ result of step 3 (i.e., count
possible results that are as bad or worse than the data);

(7) Find the p-value by dividing the result of step 6
by the result of step 5.

[Note:  See  the  appendix  for  a  discussion  of  handling  ties.] 
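
The chip-tossing thought experiment can be carried out literally for
small samples. The following is a minimal sketch, assuming that no
datum equals the standard and that no two absolute differences are
tied; the function name is an illustrative choice.

```python
from itertools import product

def wilcoxon_p_values(data, standard):
    """Exact one-sided p-values by enumerating all 2**n chip landings.

    Assumes no datum equals the standard and no tied absolute
    differences. Returns (p_higher, p_lower): the p-values for the
    alternatives that the median is higher, respectively lower,
    than the standard.
    """
    diffs = [standard - x for x in data]
    # Step 2a: rank the absolute differences, smallest gets rank 1.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0] * len(diffs)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    # Step 3: T+ sums the ranks where the standard exceeds the datum.
    t_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    # Steps 4-6: toss all chips and count landings at least as extreme.
    count_le = count_ge = 0
    for landing in product((0, 1), repeat=len(ranks)):
        total = sum(r for r, up in zip(ranks, landing) if up)
        count_le += total <= t_plus
        count_ge += total >= t_plus
    # Step 7: divide by the number of possible landings.
    n_landings = 2 ** len(ranks)
    return count_le / n_landings, count_ge / n_landings

print(wilcoxon_p_values([1.0, 2.0, 3.0], 2.4))
```

The enumeration visits every one of the two to the nth power
landings, so it is practical only for small n.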

This  finds  the  probability  of  being  wrong  in  rejecting  the  null 
hypothesis,  that  the  median  equals  the  standard,  in  favor  of  the 
alternate  hypothesis  that  the  median  is  higher  than  the  standard. 
The  p-value  for  the  other  alternate  hypothesis,  that  the  median  is 
lower than the standard, can be found by changing "less than or
equal to" in step 6 to "greater than or equal to". Obviously, the
counting in step 6 is tedious and time consuming for even a
computer when n is large and the standard is near the middle of the
data. Fortunately, it is not necessary to find the p-vulte by the
direct approach of guessing "standards" until one is found that
yields equal p-values for the two alternate hypotheses. The
shortcut is based on features of the number line and the p-value:

(A)  The  upper  and  lower  alternate  hypotheses  both  have  zero 
p-values  if  the  standards  are  outside  the  data's  range; 

(B)  In  starting  with  two  trial  standards  on  opposite  sides 
of  the  data  and  moving  them  inward,  neither  p-value 
changes  until  the  extremes  of  the  data  are  reached; 

(C)  Reaching  the  extreme  data  values  causes  (i)  the  count 
in  step  6  to  increase  from  zero  to  one  and  (ii)  the 
two  p-values  to  increase;  both  become  one  divided  by 
two  raised  to  the  nth  power; 

(D) The other points on the number line that change the
counting  in  step  6  are  values  of  "standards"  equaling 
(i)  other  data  and  (ii)  pair-wise  averages  of  the  data; 

(E) The symmetry of the number line and integer intervals
between  ranks  makes  symmetric  contributions  to  the 
two  p-values  as  the  two  "standards"  are  slid  in  unison 
over  pairs  of  points  identified  in  property  D; 

(F)  Equal  p-values  are  retained  by  crossing  pairs  of 
property  D  points  simultaneously; 

(G)  The  p-vulte  is  reached  when  the  two  "standards"  meet. 


Thus the Wilcoxon p-vulte median is the sample median of all the
pair-wise averages of the data including each datum with itself.
If the sum of the sample size and the number of pair-wise averages
(i.e., n + n!/[2!(n-2)!]) is odd, then the p-vulte is unique. If
this sum is even, the p-vulte is somewhere between the innermost
pair of data and pair-wise averages on the number line. Although
there is no justification, a unique estimate may be obtained by
instinctively defining it as the average of the innermost points;
this will be called the "even estimate".
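
For small samples the definition can be applied directly. The
following is a minimal sketch that stores every pair-wise average;
the function name is an illustrative choice.

```python
from statistics import median

def wilcoxon_p_vulte_median(data):
    """Median of all pair-wise averages, each datum paired with itself too."""
    xs = list(data)
    averages = [
        (xs[i] + xs[j]) / 2.0
        for i in range(len(xs))
        for j in range(i, len(xs))
    ]
    # statistics.median averages the two innermost values when the
    # count is even, which corresponds to the "even estimate".
    return median(averages)

print(wilcoxon_p_vulte_median([1, 2, 3, 4, 100]))
```

In the usage line the single large value pulls the average of the
data far upward, while the estimate stays near the bulk of the data.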

Although the calculation of the Wilcoxon p-vulte median is
direct in concept, its actual calculation needs a shortcut to be
practical for large data sets. Even a modest sample size generates
a large number of pair-wise averages. Even medium size computers
can have storage difficulties if all n + n!/[2!(n-2)!] = n(n+1)/2
averages are stored at once, bubble sorted, and counted off to the
middle value. Fortunately, there is a simple technique to avoid
the handling of this large array of numbers:


1. Bubble sort the data X1 X2 X3 ... Xn with the lowest datum
at the low end;

2. Think of a triangular array of pair-wise averages of all data
including each datum with itself;

             X1     X2     X3    ...    Xn

      X1    A11    A12    A13    ...   A1n
      X2           A22    A23    ...   A2n
      X3                  A33    ...   A3n
      .                           .     .
      .                             .   .
      Xn                               Ann

3.  View  the  diagonal; 

4.  Construct  and  store  the  averages  on  the  diagonal  and  the 
row  and  column  numbers  needed  to  find  these  averages; 

5.  Bubble  sort  the  diagonal,  discard  the  lowest  average, 
and  replace  it  with  the  next  largest  array  average; 

[Note: The location of the replacement average from
the discarded average is either (a) immediately
to the right on the same row or (b) immediately
down the diagonal. Clearly, replacement from
the diagonal necessitates another replacement
before proceeding to the iteration of step 6.]

6. Repeat step 5 until the sample median of the triangular
array can be found.

[Note: For odd n(n+1)/2, discarding [n(n+1)/2 - 1] / 2
values makes the smallest value on the remaining
diagonal equal to the Wilcoxon p-vulte median.
For even n(n+1)/2, discarding n(n+1)/4 - 1
values makes the average of the two smallest
values on the remaining diagonal equal to the
even estimate of the Wilcoxon p-vulte median.]

This technique uses storage for the 3n diagonal values and their
row and column sources instead of storage for the n(n+1)/2 array
values.
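
The storage-saving idea can be sketched as follows. This is a
heap-based variant and not the bubble sort procedure described
above: it keeps one candidate per row, starts from the diagonal,
and replaces each discarded average with the entry immediately to
its right on the same row. The function name is an illustrative
choice, and a non-empty data set is assumed.

```python
import heapq

def p_vulte_median_low_memory(data):
    """Median of the n(n+1)/2 pair-wise averages using O(n) storage."""
    xs = sorted(data)
    n = len(xs)
    total = n * (n + 1) // 2
    # Start with the diagonal; entries are (average, row, column).
    frontier = [((xs[i] + xs[i]) / 2.0, i, i) for i in range(n)]
    heapq.heapify(frontier)

    def pop_smallest():
        value, row, col = heapq.heappop(frontier)
        if col + 1 < n:
            # Replace it with the entry immediately to its right.
            nxt = (xs[row] + xs[col + 1]) / 2.0
            heapq.heappush(frontier, (nxt, row, col + 1))
        return value

    # Discard the values that lie below the middle of the array.
    for _ in range((total - 1) // 2):
        pop_smallest()
    middle = pop_smallest()
    if total % 2 == 1:
        return middle
    # Even count: average the two innermost values (the even estimate).
    return (middle + pop_smallest()) / 2.0

print(p_vulte_median_low_memory([1, 2, 3, 4, 100]))
```

Because the data are sorted, every row of the triangular array is in
increasing order, so the smallest remaining average is always among
the current candidates.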




The sensitivity and robustness of the Wilcoxon p-vulte median
may be illustrated by simulations. Either graphs or tables may be
used to display the results.

The two graphs display results from a simulation illustration
of (1) how rapidly repeated sampling yields convergence and (2) how
closely the convergence approaches the input parameter. The three
lines are all calculated from the same set of simulations. One
sample of size eleven is simulated to find and plot the average,
sample median, and Wilcoxon p-vulte median at the left (i.e., #=1)
ends of the three lines. Each of the following points to the right
incorporates another simulation of a sample of size eleven; the
three quantities graphed are the average of the averages, the
sample median of the sample medians, and the Wilcoxon p-vulte
median of the Wilcoxon p-vulte medians.

Since  a  uniform  population  between  zero  and  one  is  used  for 
the  simulations,  the  target  value  for  all  three  lines  is  exactly 
one  half.  The  solid  line  traces  the  central  limit  theorem 
prediction  that  the  average  of  averages  from  different  random 
samples  will  approach  the  population  mean.  The  line  with  long 
dashes  traces  the  corresponding  theorem  prediction  that  the  sample 
median  of  sample  medians  from  random  samples  will  approach  the 
population  median.  Finally,  the  line  with  short  dashes  does  the 
analogous process with the Wilcoxon p-vulte median.

The graph from populations with no outliers shows that the
average converges best. The Wilcoxon p-vulte median does almost as
well; but the sample median exhibits large excursions. Thus the
average is most sensitive; the Wilcoxon p-vulte median is quite
sensitive; and the sample median is least sensitive.

The graph from populations with outliers shows the sample
median to converge best. The Wilcoxon p-vulte median does quite
well; but the average is biased toward the weighted average of
(0.95)(0.5) + (0.05)(2.5) = 0.6. Thus the sample median is most
robust; the average is least robust; and the Wilcoxon p-vulte
median is bracketed by the sample median and the average.
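
A simulation of this kind can be sketched as follows. The uniform
population, the samples of size eleven, and the 5 percent of values
shifted by 2.0 follow the text; the number of samples, the random
seed, and all names are assumptions made only for illustration, so
the numbers produced will not match the original graphs.

```python
import random
from statistics import mean, median

def hodges_lehmann(values):
    """Median of all pair-wise averages, each value paired with itself too."""
    v = list(values)
    return median((v[i] + v[j]) / 2.0
                  for i in range(len(v)) for j in range(i, len(v)))

def simulate(n_samples=300, size=11, outlier_fraction=0.05,
             shift=2.0, seed=1993):
    """Combine each estimator over repeated samples of the given size."""
    rng = random.Random(seed)
    per_sample = {"average": [], "median": [], "p_vulte": []}
    for _ in range(n_samples):
        # Uniform(0, 1) draws; a fraction of them is shifted upward.
        sample = [rng.random()
                  + (shift if rng.random() < outlier_fraction else 0.0)
                  for _ in range(size)]
        per_sample["average"].append(mean(sample))
        per_sample["median"].append(median(sample))
        per_sample["p_vulte"].append(hodges_lehmann(sample))
    # Average of averages, median of medians, p-vulte of p-vultes.
    return {
        "average": mean(per_sample["average"]),
        "median": median(per_sample["median"]),
        "p_vulte": hodges_lehmann(per_sample["p_vulte"]),
    }

results = simulate()
print(results)
```

Setting outlier_fraction to zero gives the corresponding run for a
population with no outliers.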

An  analyst  is  never  certain  if  data  has  outliers.  Thus  the 
Wilcoxon p-vulte median is the best estimate of central tendency.

These  graphical  results  need  to  be  repeated  many  times  before 
they  can  be  generalized.  Instead  of  trying  to  compare  many  graphs, 
repeated  simulations  can  be  reported  with  tables. 

Before preparing tables, the investigation should be broadened
to  include  populations  other  than  the  uniform.  After  all,  a 
Gaussian  or  Student's  t  probability  density  function 
would  be  expected  to  have  better  convergence  than  the  uniform. 

For sensitivity investigation, the sample variance of repeated
simulations is the quantity that is desired to be minimized.
Tabulated results from a set of 200 simulated graphs
for probability density functions (pdf's) of triangular (t),
uniform (u), and sine (s) with no outliers are:

[Table only partly recoverable. Columns: number of samples of
size 11 in estimate; pdf; sample variance of averages; sample
variance of sample medians; sample variance of Wilcoxon p-vulte
medians. Surviving entries: 0.007, 0.013, 0.023, 0.036, 0.008,
0.012, 0.014.]

All  of  the  sample  variances  in  the  sample  median  columns  are 
appreciably  larger  than  those  in  the  other  two  columns.  Thus  the 
sample  median  is  the  least  sensitive.  The  sample  variances  in  the 
Wilcoxon p-vulte median column are only slightly larger than those
in  the  average  column.  Thus  the  Wilcoxon  p-vulte  median  is  almost 
as  sensitive  as  the  average. 

For  robustness  investigation,  the  quantity  of  interest  is  the 
actual  measure  of  central  tendency.  Tabulated  results  from  200 
graphs  when  the  target  is  0.5  and  5  percent  of  the  population  has 
a  bias  of  2.0  are: 

All of the values in the averages columns are appreciably further
from 0.5 than those in the other two columns. Thus the average is
the least robust. The values in the Wilcoxon p-vulte median and
sample median columns are comparable. Thus the sample median and
Wilcoxon p-vulte median are more robust than the average.


The result of the tabular investigation is thus the same
as that of the less extensive graphical analysis. The Wilcoxon
p-vulte median is the best estimator of central tendency when the
analyst does not know if outliers are present or absent.

6.  PROFESSIONAL  REVIEW  AND  ACKNOWLEDGMENTS.  All  techniques 
must  be  reviewed  and  placed  in  context  with  established  methods. 
This  review  was  provided  by  the  Thirty-Ninth  Conference  on  Design 
of  Experiments  in  Army  Research,  Development,  and  Testing  at  Rice 
University  in  October,  1993. 

Three important things were recognized at the conference.
First and most important, the p-vulte technique is the same as the
Hodges-Lehmann estimate. This pioneering work was reported with
great mathematical thoroughness in Hodges, J. L. Jr. and Lehmann,
E. L. (1963): Estimates of Location Based on Rank Tests, Ann. Math.
Statist. 34, 598-611. Second, Hodges-Lehmann estimation is based
in a philosophy that is not in either mainstream of statistical
methodology. It is neither frequentist nor Bayesian; it may be
most properly described as Fisherian. Third, it has been applied
only to one dimensional data. In the thirty years since
Hodges-Lehmann estimation was introduced, statistical methodology
has made great advances in the more productive area of
multidimensional analysis.

Many people associated with the Conference on the Design of
Experiments in Army Research, Development, and Testing are deeply
appreciated for their valuable contributions. Long before the
conference, the clinical session chairperson, W. J. Conover,
identified the Wilcoxon T test zero percent confidence interval as
the sample median of pair-wise data averages. Also before the
conference, program committee member Malcolm Taylor encouraged the
presentation of this paper. Another program committee member,
Francis Dressel, scheduled this paper in a clinical session where it
eventually received many constructive comments. Panelist Bernard
Harris recognized that the p-vulte has the statistical property of
consistency. Panelist Wei-Yin Loh identified the Wilcoxon p-vulte
median as the Hodges-Lehmann technique. Panelist J. Sethuraman
explained Loh's identification and also identified the reference to
the original journal article by Hodges and Lehmann. Program
committee member Gerald Andersen specified the section of the Rice
University library, where the conference was physically held, that
had a textbook description showing clear direct equality of the
Wilcoxon p-vulte median and the Hodges-Lehmann estimate. Panelist
Nozer Singpurwalla enunciated that the Hodges-Lehmann estimate does
not utilize any prior information in a Bayesian analysis. Many other
conference participants, especially David W. Scott who taught
the tutorial on multivariate density estimation, illustrated that
multidimensional techniques have wider applicability than the
single dimensional Hodges-Lehmann estimate.


APPENDIX. TIE BREAKING. Certain analyses such as the
Wilcoxon signed ranks T+ test have difficulties associated with
ties. There are two types of ties. First, groups of data may have
one common reported value. Second, the standard may equal the
reported value of a datum or group of data. The first of these is
easily handled by assigning ranks to each group member in a manner
that (1) does not affect the ranks of other data and (2) uses an
average rank within the group. [E.g., assign ranks of 1, 2.5, 2.5,
4, 5, 7, 7, 7, 9, 10, 11, 12 to the absolute values of the
differences between the standard and data equaling 2.05, 3.3, 3.3,
3.9, 4.5, 5.5, 5.5, 5.5, 6.1, 6.2, 6.3, 14.4.] The second type of
tie is more difficult and is discussed below.
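
The average-rank assignment for the first type of tie can be sketched
as follows, using the absolute differences from the example above.
The function name is an illustrative choice.

```python
def average_ranks(values):
    """Rank values from 1, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next value ties the current one.
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        # Mean of the ranks start+1 .. end+1 shared by the group.
        shared = (start + 1 + end + 1) / 2.0
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks

diffs = [2.05, 3.3, 3.3, 3.9, 4.5, 5.5, 5.5, 5.5, 6.1, 6.2, 6.3, 14.4]
print(average_ranks(diffs))
```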

A  trivial  method  of  handling  a  tie  of  the  standard  with  a 
datum  or  a  group  of  data  is  to  simply  discard  all  zeros  in  the  set 
of  differences  of  the  standard  and  the  data.  Unfortunately,  this 
causes  the  p-value  to  be  non-monotonic  with  the  standard. 

A more realistic method of handling these zeros is to
recognize, for continuous data, that they really don't exist. They
appear to exist only because the data were not measured to a
sufficient number of significant digits. Such apparent ties can be
removed in a pre-analysis of the data by shifting the data away
from the standard. This is analogous to the introduction of
"jitter" into data for computerized data viewing in multivariate
density estimation. An analytical analysis of this shifted data
calculates an expectation value from all possible shifts.

Obviously a probability density function is needed to
calculate  expectation  values.  Two  possibilities  are  the  binomial 
and  the  uniform. 

A  binomial  pre-analysis  may  be  used  as  a  first  approximation 
in  breaking  of  ties.  For  a  single  tie,  an  apparent  datum  X  may  be 
considered  as  being  above  or  below  the  standard  in  the  following 
picture 


                         |
    STANDARD + DELTA    -|       X
    STANDARD            -|   <-- APPARENT X
    STANDARD - DELTA    -|                   X
                         |

    PROBABILITY OF SHIFT:       1/2         1/2

with the probability density tabulated under the above picture.
Two values of the p-value are calculated from the two possible
relative locations of the apparent datum. The p-value's
expectation value is the sum of the products of possible p-values
and the probabilities of those p-values. Since the probabilities
are both 1/2 in this binomial pre-analysis for a single tie, this
p-value's expectation value is just the average of the p-values.

The picture and table of probabilities for two data, X and Y, that
apparently tie the standard are


    STD + DEL       X,Y      X        Y
    STD
    STD - DEL                Y        X       X,Y

    PROB[SHIFT]:    1/4     1/4      1/4      1/4

       or

    PROB[SHIFT]:    1/4         1/2           1/4


where the second version of the table of probabilities is for
expectation value calculations. (There is no sense in calculating
both degenerate p-values for the two center shifts.) In the
general case of n data that apparently tie the standard, binomial
pre-analysis yields a p-value's expectation value equal to the sum
[from j=0 to j=n] of the product of (1) b(j;n,1/2) and (2) the
p-value calculated from a data set with j data shifted to one side
and n-j shifted to the other of the standard.
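
The general binomial pre-analysis sum can be sketched as follows.
The callable p_value_given_split is a hypothetical placeholder: it
is assumed to return the p-value for a data set in which j of the
tied data are shifted to one side of the standard and the rest to
the other.

```python
from math import comb

def expected_p_value(n_tied, p_value_given_split):
    """Sum over j = 0..n of b(j; n, 1/2) times the p-value for split j."""
    return sum(
        comb(n_tied, j) * 0.5**n_tied * p_value_given_split(j)
        for j in range(n_tied + 1)
    )

# Two tied data with illustrative (made-up) p-values for each split:
# the weights are 1/4, 1/2, 1/4 as in the second table above.
print(expected_p_value(2, lambda j: [0.0, 0.5, 1.0][j]))
```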

The numerical value used in the shift (i.e., delta) does not
affect rank analysis results as long as delta is small compared to
the smallest separation between two data. However, not all ties of
data with itself are broken by binomial pre-analysis when two or
more data are tied with the standard. This does have an effect.
The effect of false ties of measurements of a continuous random
variable can be removed with uniform pre-analysis.


The  uniform  probability  density  function  is  an  appropriate 
description  of  data  that  is  taken  with  digital  meters.  Any  reading 
is  necessarily  rounded  off  to  the  number  of  digits  available  on  the 
meter.  The  meter  cannot  indicate  which  way  or  how  far  the  data 
value  is  off.  Thus  the  analyst  knows  only  that  the  true 
measurement should be somewhere in the interval bounded by
data  plus  or  minus  half  the  smallest  unread  digit. 


The picture and table of probabilities for uniform pre-analysis
of two apparently tied data X,Y are

[Picture not recoverable from the source; each of the four shifts
shown has probability 1/4.]

where the starred (i.e., *) non-split absolute values of differences
between the standard and split data have been deleted
and the degenerate interchanges of x and y have been removed.
Since all four of the probabilities in the table are equal, the
p-value's expectation value is the average of the four p-values
calculated from the four sets of data less X and Y plus each of the
completely-split but non-degenerate x and y. Instead of pictures,
tables can be used. For n=2, the table is


    DIFFERENCE
    BETWEEN STANDARD        SIGNS IN 4

    2 DELTA                 +  +  -  -
    1 DELTA                 +  -  +  -

where only the information essential for calculation has been
retained. For n=4, the corresponding table is

    DIFFERENCE
    BETWEEN STANDARD

    4 DELTA     +  -  +  +  +  -  -  -  +  +  +  -  -  -  +  -
    3 DELTA     +  +  -  +  +  -  +  +  -  -  +  -  -  +  -  -
    2 DELTA     +  +  +  -  +  +  -  +  -  +  -  -  +  -  -  -
    1 DELTA     +  +  +  +  -  +  +  -  +  -  -  +  -  -  -  -


where again only the information essential for calculation has been
retained. In the general case of n data that apparently tie the
standard, uniform pre-analysis yields a p-value's expectation value
equal to the sum [over j, up to j = the total number of ways of
choosing 0, 1, ..., n positive signs for the n differences between
data and the standard] of the product of (1) the reciprocal of two
to the nth power and (2) the p-value calculated from a shifted data
set where all n ties have been broken.

Uniform  pre-analysis  is  obviously  more  complete  and  time 
consuming  than  binomial  pre-analysis.  Both  are  improvements  over 
no  pre-analysis  when  rounding  off  introduces  fictitious  ties. 

Abstract:

To track the scope of industrial and governmental research in a variety of scientific areas
of interest to the Army, ARO has previously assigned scientists to read research reports
from these organizations and, based on the scientists' knowledge, to classify the research
according to a system of classification categories that correspond to Army technologies
and operational functions. To avoid this highly labor intensive effort, which needs to be
updated annually or bi-annually, an algorithm has been developed that automates this
classification. The algorithm performs the classification based only on the aggregate of
words that are used in the research report, as calibrated based on previous classifications
which had been performed manually.


Introduction: 

As part of its ongoing effort to ensure the mission relatedness of its basic research
program, and to provide the Army with guidance in its technical base programs, ARO
attempts to keep track of research in progress in industry.

A particularly convenient window on industry has been the industry reports prepared in
connection with the Independent Research And Development (IRAD) program. The
IRAD program provides major DOD contractors with funding that they can use to
perform R&D of their own choosing. In return, the contractors have been required to
provide reports to DOD describing the research. Until recently, contractors have also been
required to provide on-site reviews every 3 years to put portions of their research on
display for interested government representatives.

In past years, ARO scientists have examined the written reports and attended some of the
on-site reviews, and have manually compiled a data base that summarizes and categorizes
the research according to the major Army functions that the research supports. Examples
of such functions are logistics, mobility, vulnerability reduction, NBC protection, target
acquisition, lethality, C3I, and ECM/ECCM, and their various subfunctions. ARO has also
tracked research that supports major technical areas such as electronics, materials,
manufacturing technology, computers/computer science, and space-related technologies,
also broken down by their various subareas.

For convenience in organizing the data, ARO has assigned alphanumeric labels, called
descriptors, to designate the various research categories. Previously, the ARO descriptors
have been assigned manually to each of the IRAD projects by ARO scientists through a
tedious, labor-intensive effort of reviewing all of the IRAD reports generated by the
participating industrial agencies.

Recently, Congress has mandated a number of changes to the IRAD program. Among
these are changes to the reporting requirements for industry. On-site reviews are now
optional on the part of each company, and are less formal. Also, written reports are now
to be compressed to a maximum of 5 pages per research project, and are to be reported
annually to DTIC for inclusion on a CD ROM that DTIC will make available to DOD
agencies.

Details of the information to be included on CD ROM, and formats for the information,
are still in flux. However, these are expected to include project titles, company/profit
center names, narrative descriptions, and funding levels. Abstracts and keywords may
also be included, although these are currently among the items under negotiation
between industry, DTIC, and other interested DOD agencies.

In one respect, the new rules make it more difficult to keep track of what industry is doing
in its IRAD efforts, as a result of the shortening of the written reports and the severe
reduction in on-site reviews. On the other hand, the availability through DTIC of
computerized reports promises to enhance ARO's capability to keep track of industry
IRAD efforts. In particular, it opens the possibility of automating the process of
categorizing the IRAD efforts by Army functions supported, thereby greatly reducing the
highly time-consuming demands on ARO scientists who have previously performed this
categorizing manually.

What  follows  is  an  interim  report  on  the  development  of  a  computerized  technique  for 
automating  this  categorization  based  on  data  supplied  by  DTIC,  and  on  information 
derived from manually generated categorizations performed in previous years.

It should be noted that many of the IRAD projects are inherently structured so as to
support more than one Army function. Categorization by ARO descriptor is therefore in
most cases largely a technical judgement call, and even in principle can be correct only to
within broad tolerances. Moreover, the definitions of the Army functions and,
particularly, of the contributing technologies are inherently somewhat fuzzy
and constantly changing. Thus, the relationship between ongoing IRAD
projects and the Army functions that they tend to support is at best imprecise and changing,
even in principle, so that any technique for relating projects with functions would
necessarily be somewhat imprecise and changing.

Nevertheless, Army managers need basic information as to how the ongoing IRAD
projects and the Army’s own RDT&E tech base program tend to support the Army’s
functional needs, even if such information may be less than fully precise and/or stable over
time. It would appear, therefore, that categorizing IRAD (and Army tech base) projects
according to ARO descriptors promises to be useful to Army management for addressing
policy questions relating to funding the Army tech base, even if the categorizations are
performed only to within some residual errors.

Approach: Data

Available  data  for  automating  the  categorizations  consist  of  two  parts. 

The  first  part  consists  of  the  sample  of  data  that  had  been  categorized  manually  in 
previous  years  (i.e.,  the  historic  data).  This  consists  of  written  IRAD  reports,  supplied  by 
industry,  of  which  the  project  titles  and  industry-supplied  keywords  are  used,  together 
with  ARO-generated  characterizations  by  ARO  descriptor. 


The  second  part  consists  of  the  DTIC-supplied  data  that  represent  the  current  IRAD 
projects  which  are  to  be  classified. 

For the purposes of algorithm development, only the historic data is available. DTIC data
for current projects is being collected from industry, and is being loaded and formatted
onto CD ROMs by DTIC, but is not yet available.

Approach: Model

The  categorization  algorithm  is  based  on  a  mathematical  model,  developed  as  follows: 

Each  IRAD  report  (historic  or  current)  contains  several  data  fields  that  will  be  analyzed. 
The  entries  in  the  data  fields  will  be  broken  into  words.  In  this  way,  each  project  report 
will  generate  an  aggregate  of  words  that  have  been  taken  from  the  various  data  fields,  and 
the  aggregate  of  words  so  obtained  will  be  regarded  as  collectively  representing  the 
project.  The  words  will  be  used  as  a  basis  for  categorizing  the  project  and  associating  it 
with  an  ARO  descriptor.  To  do  this,  the  procedure  must  first  be  calibrated  based  on 
historic  projects. 

Mathematically,  the  collection  of  words  derived  from  all  of  the  historic  projects  (i.e.,  those 
used  to  calibrate  the  model)  will  define  a  multi-dimensional  mathematical  space.  In  this 
space,  each  word  corresponds  to  one  of  the  dimensions,  and  vice  versa.  For  convenience, 
call  this  space  word  space. 

Now consider an arbitrary historic project. This project defines a vector in word space, as
follows: Each coordinate has the value 1 if the corresponding word appears in the word
aggregate for that project, and has the value 0 if it does not appear. That is, the coordinate
indicates whether or not the word appears at least once in the project report (or, more
accurately, in those data fields of the report which are used in the analysis).

Next consider those vectors derived from projects that have a given ARO descriptor
assignment. The average of those vectors will be used to represent that particular ARO
descriptor. A descriptor vector V_D corresponding to an ARO descriptor D therefore has
the following simple meaning: its typical component V_DW corresponding to a word W in
the calibration word set measures the observed fraction of projects with descriptor D in
which word W appears at least once. Each of the V_DW will therefore be between 0 and 1.
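
The calibration step just described can be sketched in a few lines of Python. This is only an illustrative sketch, not the Turbo Pascal implementation reported in this paper: the function names, the lowercasing of words, and the toy project data are all assumptions made for the example.

```python
from collections import defaultdict


def word_set(text):
    """Return the distinct words in a project's analyzed data fields.

    Lowercasing is an assumption of this sketch; the paper does not say
    how letter case is handled.
    """
    return set(text.lower().split())


def calibrate(historic):
    """Build descriptor vectors from manually classified projects.

    historic: list of (descriptor, text) pairs.
    Returns (vocabulary, V), where V[d][w] is the fraction of projects
    with descriptor d in which word w appears at least once.
    """
    counts = defaultdict(lambda: defaultdict(int))  # counts[d][w]
    totals = defaultdict(int)                       # projects per descriptor
    vocabulary = set()
    for descriptor, text in historic:
        words = word_set(text)
        vocabulary |= words
        totals[descriptor] += 1
        for w in words:
            counts[descriptor][w] += 1
    V = {d: {w: counts[d][w] / totals[d] for w in vocabulary} for d in totals}
    return vocabulary, V


# Hypothetical toy data, not taken from the IRAD reports.
historic = [
    ("OPTICS", "laser optics sensor"),
    ("OPTICS", "optics coating"),
    ("ARMOR", "ceramic armor coating"),
]
vocabulary, V = calibrate(historic)
print(V["OPTICS"]["optics"])  # 1.0
print(V["OPTICS"]["laser"])   # 0.5
print(V["ARMOR"]["optics"])   # 0.0
```

Each descriptor vector is simply the average of the 0/1 project vectors carrying that descriptor, so every component lies between 0 and 1.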

Having completed the calibration using historic data, consider the problem of assigning
ARO descriptors to a set of current projects. To do this, choose an arbitrary project with
vector X, where component X_W = 1 or 0 according to whether word W appears at least
once in the project writeup or not.

Now define a metric in word space. It might seem sufficient to use the metric

$$H_D = \sum_W |X_W - V_{DW}|$$

to represent the distance between the vector X and a typical descriptor vector V_D.

There is, however, a problem that will cause this metric to need some modification.

As defined above, H_D gives equal weight to all words in the calibrated data base.

However, some words clearly have better discriminating power than others, and this needs
to be reflected in the definition.

To see how this comes about, consider the two ways that words can fail to discriminate.
One way is for a word to appear in almost all project writeups. Such common words as a,
the, it, of, with, and, ... would clearly fail to help to identify projects as to their content.

So also would words such as advanced, novel, ... and others which seem to find their
way into most project writeups.

The second way is for a word to appear only once, or at most a very few times, so that the
word is likely random and thus not associated strongly with the project writeup’s content.
Typographical errors might fall into this category.

In the first case, the components V_DW for the given word W will be close to 1 for all
descriptors D; in the second case, the components will be close to 0 for all D. It follows
that, for word W to be a good discriminator among the D, the V_DW must vary
widely over the D. To reflect this, modify the definition of the metric H_D as follows:

$$H_D = \sum_W |X_W - V_{DW}| \, G_W$$

where G_W is a weighting function defined as

$$G_W = \max_D (V_{DW}) - \min_D (V_{DW})$$

Using this metric, the project is assigned the descriptor D for which the metric is smallest.
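
A minimal sketch of the weighted metric and the resulting assignment follows. The function names and the small calibration table V are hypothetical and serve only to show the arithmetic; words in a project that do not occur in the calibration vocabulary are ignored, since they correspond to no dimension of word space.

```python
def weights(V):
    """G_W = max over D of V_DW minus min over D of V_DW."""
    words = next(iter(V.values())).keys()
    return {w: max(V[d][w] for d in V) - min(V[d][w] for d in V) for w in words}


def distance(x_words, v_d, G):
    """Weighted distance H_D = sum over W of |X_W - V_DW| * G_W."""
    total = 0.0
    for w, v_dw in v_d.items():
        x_w = 1.0 if w in x_words else 0.0
        total += abs(x_w - v_dw) * G[w]
    return total


def rank_descriptors(x_words, V):
    """Return (distance, descriptor) pairs, smallest distance first."""
    G = weights(V)
    return sorted((distance(x_words, V[d], G), d) for d in V)


# Hypothetical calibration table: V[d][w] = fraction of projects with
# descriptor d that contain word w.
V = {
    "OPTICS": {"the": 1.0, "optics": 1.0, "laser": 0.5, "armor": 0.0},
    "ARMOR":  {"the": 1.0, "optics": 0.0, "laser": 0.0, "armor": 1.0},
}
ranking = rank_descriptors({"the", "laser", "optics"}, V)
print(ranking)  # [(0.25, 'OPTICS'), (2.5, 'ARMOR')]
```

The word "the" appears in every project of both descriptors, so its weight is 0 and it contributes nothing to either distance. Because the full ranking is returned, the second and third smallest distances remain available as well.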

In fact, one can also keep track of the smallest, second smallest, third smallest, etc. Using
these, there are several possible interpretations. One is the assignment of probabilities that
D is correctly assigned. Another is that the project is to be apportioned (by funding
level, perhaps) among the various D in some way. Yet another is a fuzzy set
interpretation that assigns the project partially to each of the D, without necessarily
requiring the partial assignments to add to 1. How best to perform such an interpretation
is at this point an open question.


Definition  of  “Word” 

Although the foregoing model suffices, in principle, to define an assignment procedure
that can be automated, there is yet one more refinement to add.

The  refinement  has  to  do  with  what  it  is  that  constitutes  a  word.  In  one  sense,  the  matter 
is  easily  settled.  A  word  is  simply  any  string  of  characters  (not  itself  containing  a  space) 
between  two  spaces.  The  problem,  however,  is  more  subtle. 

Consider, for example, the words "optic," "optics," "optical," and "optically." These
might appear to be four distinct words, and would according to the above model be
treated as four distinct words. Nevertheless, the words are very similar semantically, and
for maximum reliability in assigning ARO descriptors should be treated as one word.

There are, in principle, several ways to do this. One could, at great effort and expense,
compile a table of all English words, augmented by all technical, governmental, and
military terms, and assign them to a subset of "root" words. Another way, at perhaps even
greater expense, would be to develop rules of English by which one could constructively
make the assignments. A much simpler way, though only approximate, is to truncate each
word to its first k characters, where k is a parameter to be determined. This is the only
practical method, and is the one that will be used.
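
The truncation rule is simple enough to show directly. The sketch below is illustrative only; the function name is hypothetical, and lowercasing is an assumption not stated in the paper.

```python
def truncate_words(text, k):
    """Split text on whitespace and keep the first k characters of each word."""
    return {word[:k] for word in text.lower().split()}


# With k = 4 the four variants collapse to a single "word".
print(truncate_words("optic optics optical optically", 4))  # {'opti'}

# With a large k nothing is merged, and with k = 1 unrelated words collide.
print(len(truncate_words("optic optics optical optically", 20)))  # 4
print(truncate_words("armor antenna", 1))  # {'a'}
```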

A problem, though, is how best to choose the truncation length k. If k is large, there is no
truncation and therefore semantically equivalent variants of a single word will tend to be
treated as distinct, as in the example above. If k is too small (e.g., k = 1), then words that
are semantically very different will tend to be treated as identical. This is also incorrect.
The best value of k will therefore lie between the extremes. To find the best value, tests
were run based on a subset of the full data base.

It turned out that the assignment reliability was essentially flat between k = 4 and k = 7.
Outside these values, the assignments became progressively more erratic. However, k = 3
was not very much less reliable than k = 4.

It seems, based on the test runs, that truncating to the first 4 characters produces the best
results, as well as reducing the size of the problem after calibration, and also the
computation times for both calibration and descriptor assignment.

In itself, it seems both a surprising and counterintuitive result that a value as small as k = 4
should work as well as it does.


Preliminary  Results 

Test cases were run using small subsets of historic data to calibrate the algorithm. The
same subsets of data were used as test data, to be assigned to ARO descriptors. Subsets,
rather than the full historic data set, were used in order to keep the test computations within
a feasible computing time for test purposes. This procedure, in which calibration and
assignment data are the same, will of course generate test results that may be distinctly
optimistic. As a test of feasibility, however, the procedure serves as a reasonable indicator
in the sense that, if the test results are poor when generated in this way, then it is unlikely
that the algorithm can be made to work under realistic conditions.

The full historic data base consists of 5915 projects that extend over about 540 ARO
descriptors. Test cases, randomly selected from the full data base, have consisted of up to
about 245 projects.

Tentative results, generated in this way, tend to show an assignment reliability of 98% or
greater. This is clearly too optimistic to expect under realistic conditions, but at least
demonstrates the feasibility of the procedure. It is to be understood that the research is
ongoing, and that the results reported here are only a first cut, and are to be regarded as
tentative and, as noted above, biased toward an optimistic outcome.

Computation times for the algorithm have required up to 60 hours on a 486/50 PC, using
a Turbo Pascal implementation of the algorithm. The principal reason for the large
computing time has to do with memory limitations. The calibration matrix V and other
calibration data require too much memory to be kept in RAM. The algorithm was
therefore implemented in such a way that the calibrated data was stored on disk. The
classification procedure therefore required a large number of hard disk accesses, which are
slow and which consumed the overwhelming portion of the total running time.

A production version of the algorithm will attempt to reduce the number of hard disk
accesses that are needed.

Moreover, it has been observed that the definitions of the historic ARO descriptors had
been chosen in a way that can be improved in two important respects: (1) descriptors can
be eliminated or combined where there are found to be few or no project entries; and (2)
descriptors can be re-defined to reduce or remove potential ambiguities in the way that
they are likely to be assigned. A task is currently in progress to re-define the descriptors
accordingly, and this has largely been done. The revised list contains only 150 ARO
descriptors, as compared with 540 previously. This is expected to sharply reduce the
memory requirements and computation time, and to significantly improve the reliability of
the results.

Tests  are  in  process  but  results  are  not  yet  available. 


Implementation  Problems 

Given, eventually, a successful test implementation with the full historic data base, using
the revised ARO descriptors, there will remain several problems that will still need to be
addressed. These include the following:

Even  as  revised,  the  ARO  descriptors  may  not  be  optimal.  In  principle,  there  probably 
exists  some  kind  of  a  natural  clustering  of  the  research  projects  that  implies  and 
corresponds  to  some  optimal  set  of  descriptors.  The  identification  of  such  clustering  and 
the associated descriptors remains to be done.

The assignment procedure depends upon the existence of a number of calibrated vectors
V_D that represent the various descriptors D. Among the IRAD projects, however, will be a
number which may be of interest to the Navy or to the Air Force, but which are not
applicable to Army functions. Corresponding to these, there will be no ARO descriptor,
except for the default that identifies them as “Not Applicable,” or NA. Unlike the other
descriptors, each of which represents projects with some common body of technologies
and applications, the NA descriptor represents a broad collection of projects with little in
common. The vector V_D that corresponds to descriptor NA will therefore not be “close” to
typical vectors of NA projects. Typically, then, NA projects will appear to be closer to
other descriptor vectors, and the projects will therefore tend to be misidentified. It may be
possible to filter most of the NA projects by requiring, for a project to be identified with a
descriptor, not only that the descriptor vector be closest to the project vector, but that the
distance between them not exceed some empirically determined threshold.
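
The proposed filter amounts to one extra comparison after the nearest descriptor has been found. The sketch below assumes the distances H_D for a project have already been computed; the function name, the descriptor names, and the threshold value are hypothetical, since the paper leaves the threshold to be determined empirically.

```python
def assign_with_threshold(distances, threshold, default="NA"):
    """Assign the nearest descriptor unless even that one is too far away.

    distances: dict mapping descriptor -> distance H_D for one project.
    threshold: largest distance still accepted as a match (hypothetical value).
    """
    best = min(distances, key=distances.get)
    return best if distances[best] <= threshold else default


# Hypothetical distances for two projects.
print(assign_with_threshold({"OPTICS": 0.25, "ARMOR": 2.5}, threshold=1.0))  # OPTICS
print(assign_with_threshold({"OPTICS": 3.1, "ARMOR": 2.8}, threshold=1.0))   # NA
```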

From one year to the next, technologies change and Army functions (thus ARO
descriptors) also change. The calibration that was valid for last year will therefore not be
fully valid this year. Annual maintenance of the calibration, both as to technologies and as
to Army functions, needs to be addressed.


Other  Potential  Applications  of  the  Methodology 

If  successful,  the  approach  used  here  to  classify  IRAD  projects  with  respect  to  their  Army 
functional  relevance,  as  measured  by  ARO  descriptors,  might  also  be  applied  to  other  and 
unrelated problems. Representative examples might include:

In linguistics, it might be interesting to use this approach to study the semantic content of
words, parts of words, and sequences of words. The empirical observation, alluded to
above, that some significant semantic content is embodied in as few as the first 3
characters of a word, seems relevant.

In a related spirit, one might use this approach to study the psychology of how we humans
organize and perceive and understand language. A simple version of such a study might
take the form of presenting readers with standard English text, with all words truncated to
no longer than k characters, for various k, and observing the kinds of difficulties that the
readers have in interpreting the truncated text.

In literature, forensics, history, and military intelligence, there arise questions of who
wrote what. The approach used here might prove useful in cases where
literary samples are available from each of the candidates for authorship attribution, and
the question is that of identifying the actual author.

ABSTRACT. Mammalian sperm develop distinctive motion patterns
during capacitation known as hyperactivated motility. Many
studies now point to an association between hyperactivation and
in vitro fertilization. A method for the objective determination
of hyperactivation is sought as a tool for the clinical assessment
of fertility, and as a marker for the investigation of sperm
function. Hyperactivated motility is characterized by a change
from progressive movement to a highly vigorous, nonprogressive
random motion. Historically, the determination of the level of
hyperactivated motility has been based on the visual inspection
of the cell's path as recorded on film, an extremely lengthy
process for a sample containing hundreds of cells. Recent advances
in videomicrography allow the cell locations to be tracked by
computer systems which record many motility characteristics for
each cell (e.g., the straight line velocity). In this presentation
we will discuss the application of statistical classification
supporting the automated discrimination between hyperactivated
and non-hyperactivated cells on the basis of their
motility characteristics. We will also preview on-going work
where the proportion of hyperactivated cells determined by the
classification rule is used as a response in assessing the
toxicological effect of certain metals.


1. INTRODUCTION. This work centered on the establishment of
an automated procedure to classify rabbit sperm cells as to their
motility, hyperactivated or non-hyperactivated. In Figure 1 we
show the digitized representation of the swimming paths or tracks
of several cells. Hyperactivated motion is described as a highly
vigorous, nonprogressive, random motion (e.g., cell tracks 23,
27, 16 and 20 of Figure 1). Hyperactivation is the process of
developing from a linear progressive motion (e.g., cell tracks
21, 41, 9, and 12 of Figure 1) to hyperactivated motion. The
interest in hyperactivation is that it has been found to be
strongly associated with capacitation, the biochemical/biophysical
changes a cell undergoes, enabling it for fertilization
(Tesarik et al., 1990). Whereas the components of capacitation
are not easily measured, the cell motions can be. Motility
classification, supported by these measures, has potential as a
marker for capacitation.

Great improvements have been made in recent years in quantifying
cell motions. Several systems are now on the market to
support computer assisted videomicrography. This technology
provides a VCR tape of the swimming motions of cells, a digitized
record of cell motion, and motility parameter values for each
cell. With this automation, it is much faster to characterize
cells in terms of their motility parameter values, a process once
carried out by hand as film frames were successively projected on
a grid. The task in this study was to take cell tracks
from previously determined hyperactivated or non-hyperactivated
cells and establish a rule for classification based on these, now
easy to establish, motility parameter values.

2. ESTABLISHING MOTILITY PARAMETERS. The motility parameters
of 322 hyperactivated sperm obtained by incubation of sperm from
four rabbits under the capacitation conditions of Brackett and
Oliphant (1975), and 899 non-hyperactivated sperm incubated for
one or two hours in T medium were chosen for the statistical
analysis. The hyperactivated sperm population contained the major
types of hyperactivated motion noted in the literature. The
parameters under study were VSL [velocity over a straight-line
path (μm/sec)], VCL [velocity over a curvilinear path (μm/sec)],
VAP [velocity over a smoothed curvilinear path; 5-point moving
average (μm/sec)], LIN (VSL/VCL), STR (VSL/VAP), WOB (VAP/VCL),
AALH [average amplitude of the lateral head displacement (μm)],
MALH [maximum amplitude of the lateral head displacement (μm)],
BCF [beat cross frequency (Hz)], Dance [AALH/LIN (μm)], and Dance
Mean [VCL*AALH (μm^2/sec)].
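
The three ratio parameters follow directly from the three velocities. As a small illustration (the function name and the sample velocities are hypothetical, not measurements from this study):

```python
def derived_ratios(vsl, vcl, vap):
    """Compute LIN, STR and WOB from the three velocity measures.

    vsl, vcl and vap are velocities in the same units (e.g. micrometers/sec);
    the ratios are dimensionless.
    """
    if vcl <= 0 or vap <= 0:
        raise ValueError("VCL and VAP must be positive")
    return {
        "LIN": vsl / vcl,  # straight-line vs. curvilinear velocity
        "STR": vsl / vap,  # straight-line vs. smoothed-path velocity
        "WOB": vap / vcl,  # smoothed-path vs. curvilinear velocity
    }


ratios = derived_ratios(vsl=30.0, vcl=120.0, vap=60.0)
print(ratios)  # {'LIN': 0.25, 'STR': 0.5, 'WOB': 0.5}
```

Note that LIN is the product of STR and WOB, so the three ratios are not independent of one another.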

3.  PREVIOUS  CLASSIFICATION  MODELS.  Others  have  attempted 
to  use  the  motility  parameter  values  for  classification  (e.g., 
Mortimer  and  Mortimer,  1990;  Burkman,  1991)  with  reasonable 
success.  A  potential  for  further  analysis  was  suggested 
because  1)  LIN,  a  key  measure  in  the  decision,  would  in  some 
cases  be  misleading,  and  2)  there  was  opportunity  to  employ 
more sophisticated means of statistical classification.

The first issue was the reliance in decision rules on LIN. In
Figure 2, four possible tracks are given with the associated
values for VCL, LIN, WOB, and AALH. According to Mortimer and
Mortimer, high values (> 0.60) for LIN are indicative of a
non-hyperactivated or linear progressive motion, and low values
(< 0.60) indicate hyperactivated motion. Figures 2a and 2b show
non-hyperactivated and hyperactivated motions, respectively, where
the values for LIN are consistent with the rationale for its use
(i.e., when VSL and VCL are different, departure from linear
progressive motion is present). Figures 2c and 2d show
non-hyperactivated motions where, because of the looping path of
the cell, the values of LIN are in the range for hyperactivated
motion. Thus, the measure LIN will in some instances mislead.


Figure 2. Motility parameter measurements with accompanying cell
path display for motion types: a) non-hyperactivated,
b) hyperactivated, and c-d) non-hyperactivated.

A second issue was in the classification methods employed.
No article in the biological literature suggested using traditional
statistical methods for classification. Techniques
employed were, for example, comparing the univariate relative
frequency histograms associated with the hyperactivated and
non-hyperactivated cells, using t-tests to test the difference
between mean values from the two groups, and examination of
summary statistics from each group. The actual decision rule
suggested by Mortimer and Mortimer would have required a cell to
satisfy each of three constraints: VCL > 100 μm/sec, LIN < 0.60,
and AALH > 5 μm. Each was determined individually.
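
Written out as code, the three-constraint rule quoted above is a simple conjunction. The function name and the sample values are hypothetical; the thresholds are those stated in the text, with velocities in micrometers per second and amplitudes in micrometers.

```python
def mortimer_rule(vcl, lin, aalh):
    """Return True if a cell meets all three hyperactivation constraints."""
    return vcl > 100.0 and lin < 0.60 and aalh > 5.0


print(mortimer_rule(vcl=150.0, lin=0.40, aalh=7.0))  # True
print(mortimer_rule(vcl=150.0, lin=0.40, aalh=4.0))  # False
```

Because every constraint must hold, a cell that misses any single threshold is classified as non-hyperactivated.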

4. GRAPHICAL ANALYSIS. Classification potential was assessed
graphically by comparing the motility parameter distributions for
hyperactivated and non-hyperactivated cells. Figure 3 shows
unmodified parallel boxplots of the hyperactivated (H) and
non-hyperactivated (N) class distributions for each motility
parameter. Data were normalized to support examination of the
relative classification potential among motility parameters. For
LIN and WOB, at most 25% of the non-hyperactivated cells show
values in the same range as those of the hyperactivated class,
indicating that both have strong potential for classification.
Based on the degree of separation for the inner 50% of the data,
it is likely that AALH, MALH, and VCL*AALH would also be reasonable
classifiers. Note that a classification rule based on VCL
alone did not appear promising.

The relative frequency distributions for LIN, VCL, AALH, and
WOB are given for each motility class in Figure 4. LIN, VCL, and
AALH were selected for display because of their prominence in the
literature (2-4), and WOB for its importance in this study.
Hyperactivated cells were absent in the range (0.8 - 1.0) for
both LIN and WOB, and conversely high percentages of
non-hyperactivated cells (75.5% for LIN and 94.3% for WOB) were
found over this range. This strongly suggests good classifying
potential for each. AALH shows only minimal distribution overlap.
VCL has considerably more. The individual concomitants of hyperactivation
suggested by Mortimer and Mortimer are reasonably consistent with
these results despite the fact that rabbit sperm, not human
sperm, values are reported here.

As  a  starting  point  for  improvement,  the  rules  suggested  by 
Mortimer  and  Mortimer  were  implemented  on  our  data.  The  results 
appear  as  Figure  5.  In  Fig  re  5a  it  can  be  seen  that  cells  satis¬ 
fying  the  VCL  and  AALH  constraints  (partitions  have  been  over¬ 
laid)  for  hyperactivated  motion  are  very  likely  hyperactivated, 
but  a  good  number  of  cells  not  satisfying  those  constraints  are 
also  hyperactivated.  VCL  and  AALH  are  linearly  associated.  In 
Figure  5b  all  three  conditions  are  shown.  Again,  a  number  of 
hyperactivated  cells  do  not  meet  the  decision  criteria.  Of  course 
the  rules  are  being  implemented  on  a  species  for  which  they 
were  not  intended.  Further  investigation  of  rules  based  on 
these  parameters  and  our  data  was  warranted. 
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Figure  3.  Parallel  boxplots  of  motility  parameter  measures  fo 
hyperactivated  (H)  and  non-hyperactivated  cells. 
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Figure  4., Relative  frequency  distributions  for  a)  LIN,  b) AALH , 
c)  VCL,  and  d)  WOB  under  hyperactivated  (H)  and  non- 
hyperactivated  (N)  motility. 

5. DISCRIMINANT ANALYSIS. Discriminant analysis and regression
on a (0,1) class variable were used to explore models for
classification.  All  possible  subsets  regression  was  used  to 
select  the  best  models  for  each  number  of  independent  variables. 
BMDP  programs  supporting  stepwise  discriminant  analysis  and 
regression  were  used  in  the  analysis.  Table  1  lists  the  results 
for  individual  motility  parameters,  the  best  two-parameter 
models, and the best three-parameter models. Though many models
perform  well,  it  is  clear  that  WOB  is  the  key  motility  parameter. 
Model  11,  based  on  WOB  and  VCL,  was  judged  to  be  the  best.  It  had 
the  highest  efficiency,  and  included  motility  parameters  which 
were  not  strongly  linearly  associated,  correlation  -0.47. 
(Interestingly,  multiple  correlation  made  the  use  of  AALH,  LIN, 
and  VCL  together  an  undesirable  choice  from  a  prediction 
standpoint.)  The  discriminant  form  of  the  model  was  selected  but 
did  not  differ  markedly  from  the  regression  model.  Cells  were 
classified  as  hyperactivated  if 

WOB < 0.596 + VCL * (6.76 * 10^-4).

Jackknifed  cross  validation  procedures  using  BMDP  software 
reinforced  this  model  selection. 
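
The discriminant rule above can be written as a one-line predicate. The following is only a sketch: the function name and the example cells are hypothetical, and the exponent is read as 10^-4, which is the only reading that keeps the boundary inside the WOB range of 0 to 1.

```python
def classify_discriminant(wob: float, vcl: float) -> bool:
    """Return True when the discriminant rule calls a cell hyperactivated."""
    # Boundary quoted in the text: WOB < 0.596 + VCL * 6.76e-4.
    return wob < 0.596 + vcl * 6.76e-4


if __name__ == "__main__":
    # Hypothetical (WOB, VCL) pairs, not taken from the study data.
    for wob, vcl in [(0.50, 100.0), (0.90, 100.0), (0.75, 300.0)]:
        print(wob, vcl, classify_discriminant(wob, vcl))
```

Because the boundary rises with VCL, a fast cell can be classified as hyperactivated at a WOB value that would fail the rule at low velocity.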

6. CART MODEL. Tree-structured methods were also used in
developing a model (CART™, version 1.1, 1985, California
Statistical Software, Inc.). The CART routine offers many options;
only the defaults were used. Generally, CART works as follows for
univariate partitions. Each possible predictor variable (motion
parameter) for class is examined individually. For a specific
variable, the program searches over all the values, testing at
each one to see how efficient it would be to partition the data
into hyperactivated and non-hyperactivated classes based on that
value. (In our data set this requires over 1200 assessments of
efficiency for each variable.) The routine notes the best value
for that variable based on classification efficiency. The variable
which partitions the data in the most efficient manner is
selected and its value is used as the first partition of the
data, creating two nodes, one each for the hyperactivated and
non-hyperactivated classes. Within each one, some cells may be
misclassified. The routine then searches among the variables
looking to further partition the two nodes to increase efficiency.
The routine eventually settles on a decision tree for classification
with maximum efficiency, subject to the constraint that
the tree complexity should not be great.
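
The first-partition search described above can be sketched as an exhaustive scan over candidate cut points. This is an illustration under stated assumptions, not the CART program itself: it scores splits by plain classification efficiency, as in the description, whereas CART implementations typically use an impurity measure; the function names and the toy data are hypothetical.

```python
from typing import Dict, Optional, Sequence, Tuple

Split = Tuple[Optional[float], str, float]  # (cut value, direction, efficiency)


def best_threshold(values: Sequence[float], labels: Sequence[int]) -> Split:
    """Scan every observed value of one variable as a candidate cut point."""
    best: Split = (None, "", 0.0)
    n = len(values)
    for cut in sorted(set(values)):
        for direction in ("<", ">"):
            # Predict "hyperactivated" on one side of the cut.
            if direction == "<":
                predicted = [v < cut for v in values]
            else:
                predicted = [v > cut for v in values]
            correct = sum(int(p == bool(lab)) for p, lab in zip(predicted, labels))
            eff = correct / n
            if eff > best[2]:
                best = (cut, direction, eff)
    return best


def best_variable(data: Dict[str, Sequence[float]],
                  labels: Sequence[int]) -> Tuple[str, Split]:
    """Pick the variable whose best single cut classifies most cells correctly."""
    scored = {name: best_threshold(vals, labels) for name, vals in data.items()}
    name = max(scored, key=lambda k: scored[k][2])
    return name, scored[name]


# Hypothetical toy data (label 1 = hyperactivated), not the study data.
TOY_DATA = {
    "WOB": [0.45, 0.55, 0.60, 0.80, 0.85, 0.90],
    "VCL": [120.0, 90.0, 150.0, 100.0, 60.0, 140.0],
}
TOY_LABELS = [1, 1, 1, 0, 0, 0]

if __name__ == "__main__":
    print(best_variable(TOY_DATA, TOY_LABELS))
```

Applying the same search again inside each resulting node gives the recursive partitioning that grows the tree.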

In  running  CART,  all  the  motility  parameters  considered 
earlier  as  possible  predictors  were  included.  The  result  was  that 
CART chose only WOB and VCL, with the rule: classify as hyperactivated
if

VCL  >  51  and  WOB  <  0.73. 

The  decision  tree  is  illustrated  as  Figure  6.  Of  the  1221  cases 
examined,  only  12  non-hyperactivated  cells  and  2  hyperactivated 

Table 1. Summary of best models using discriminant (D)/regression (R) analysis
based on 322 hyperactivated (H) and 899 non-hyperactivated (N) cells

                              Misclassified (No.)     Efficiency (%)
Model  Variables              H (D/R)     N (D/R)     D/R              R²
  1    WOB                    22/31       19/15       96.64 / 96.23    0.838
  2    LIN                    21/34       63/46       93.12 / 93.45    0.702
  3    AALH                   59/91       5/4         94.76 / 92.22    0.639
  4    MALH                   63/109      16/9        93.53 / 90.34    0.600
  5    VCL*AALH               113/188     4/0         90.42 / 84.60    0.411
  6    STR                    132/171     78/36       82.80 / 83.05    0.356
  7    VCL                    108/209     205/56      74.37 / 78.30    0.261
  8    VSL                    56/210      301/26      70.76 / 80.67    0.235
  9    AALH/LIN               190/283     1/0         84.36 / 76.82    0.140
 10    WOB, AALH              21/32       16/10       96.97 / 96.56    0.856
 11    WOB, VCL               23/30       12/11       97.13 / 96.64    0.847
 12    VCL, LIN, AALH         23/34       34/29       95.33 / 94.84    0.757
 13    VCL, LIN, MALH         24/38       40/30       94.76 / 94.43    0.746
 14    VCL, LIN, VCL*AALH     24/40       50/36       93.78 / 93.78    0.729

Figure 6. CART decision tree for determining hyperactivation.

Figure 7. Scatterplot of VCL and WOB showing the association between
actual hyperactivated motility and predicted hyperactivated
motility using the CART model, where the indicated quadrant
signifies predicted hyperactivated motility.

cells were misclassified, for an efficiency of 98.9%. Cross
validation attempts, holding out randomly selected subsets,
consistently  identified  WOB  and  VCL  as  the  important  motility 
parameters.  Application  of  this  decision  rule  to  our  data  appears 
as  Figure  7. 
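
The efficiency figures quoted in this paper are consistent with the share of the 1221 cells that were correctly classified. The helper below is a sketch of that arithmetic; the function name is hypothetical, and the definition is inferred from the reported counts rather than stated in the text.

```python
def efficiency(misclassified_h: int, misclassified_n: int, total: int = 1221) -> float:
    """Percentage of cells correctly classified."""
    return 100.0 * (1.0 - (misclassified_h + misclassified_n) / total)


if __name__ == "__main__":
    # CART rule: 2 hyperactivated and 12 non-hyperactivated cells misclassified.
    print(round(efficiency(2, 12), 1))
    # Discriminant Model 11 (WOB, VCL): 23 H and 12 N misclassified.
    print(round(efficiency(23, 12), 2))
```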

The  use  of  LIN,  AALH,  and  VCL  was  also  investigated.  CART  did 
not  choose  VCL.  The  tree  was  slightly  more  complex,  having  five 
nodes  instead  of  three  as  above.  The  classification  efficiency 
was  96.5%.  When  a  model  based  on  WOB  and  AALH  was  attempted,  CART 
did  not  choose  to  use  AALH,  opting  instead  for  a  rule  based  only 
on WOB for an efficiency of 97.0%. Other runs using linear
combinations  of  variables  were  attempted  but  resulted  in  more 
complex  decision  trees. 

7.  MODEL  COMPARISON.  Figure  8  illustrates  decision  criteria 
delivered  by  the  discriminant  and  CART  models  using  VCL  and  WOB. 
To  understand  the  model  differences  we  have  partitioned  the  point 
set  WOB  X  VCL,  where  WOB  ranges  from  0.0  to  1.0  and  VCL  ranges 
from  0  to  350,  according  to  the  hyperactivity  decision  rules  for 
each  model.  A  cell  whose  WOB  and  VCL  values  locate  it  in  a  shaded 
region  would  be  classified  non-hyperactivated  by  CART.  The 
unshaded  region  corresponds  to  a  hyperactivated  classification 
delivered  by  CART.  The  bold  line  represents  the  discriminant 
model.  Points  falling  below  that  line  would  be  classified  as 
hyperactivated, and above, non-hyperactivated. Within each region
we have indicated the true number of hyperactivated and non-hyperactivated
cells present. From this one can assess their
relative  performance,  and  will  find  the  CART  model  to  be  slightly 
better. 
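
The partition of the WOB x VCL rectangle can be explored numerically. The sketch below encodes both decision rules as quoted in Sections 5 and 6 and measures the fraction of the rectangle (WOB from 0 to 1, VCL from 0 to 350) on which they disagree. The function names and grid resolution are illustrative assumptions, and the result describes the geometry of the two rules only, not the observed cells.

```python
def classify_discriminant(wob: float, vcl: float) -> bool:
    """Discriminant rule: hyperactivated below the line in the (VCL, WOB) plane."""
    return wob < 0.596 + vcl * 6.76e-4


def classify_cart(wob: float, vcl: float) -> bool:
    """CART rule: hyperactivated if VCL > 51 and WOB < 0.73."""
    return vcl > 51 and wob < 0.73


def disagreement_fraction(steps: int = 200) -> float:
    """Fraction of midpoint grid cells on which the two rules differ."""
    differ = 0
    for i in range(steps):
        wob = (i + 0.5) / steps
        for j in range(steps):
            vcl = 350.0 * (j + 0.5) / steps
            if classify_discriminant(wob, vcl) != classify_cart(wob, vcl):
                differ += 1
    return differ / (steps * steps)


if __name__ == "__main__":
    print(disagreement_fraction())
```

The two rules differ mainly for slow cells with low WOB and for fast cells with WOB just above 0.73, so the counts in those regions decide which model performs better.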

8. APPLICATION. An experiment was conducted in which sperm
cells were exposed to metal ions in four concentrations over
four different time periods. This factorial design was run
within blocks (different rabbit donors). Cells were identified
as hyperactivated or non-hyperactivated by the CART model established
above. Initial graphical analysis (Figure 9) suggests an
adverse  effect  induced  by  increased  exposure  to  lead  on  the 
percentage  of  motile  cells  which  exhibited  hyperactivated  motion. 
Since  lead  is  known  to  be  a  reproductive  toxicant,  this  might 
suggest  that  one  impact  is  in  its  inhibition  of  hyperactivation. 


9. SUMMARY. A classification problem in reproductive toxicology
was approached using well known statistical procedures.
We found classification criteria involving a different set of
motility parameters than had been suggested in the literature.
Further, the combination of WOB and VCL performed better
than the popular set of VCL, AALH, and LIN. Application of the
new model is now being made to help uncover potential reproductive
toxicants.

Figure 8. Comparison of models, where the unshaded region and the
half-plane below the discriminant model denote regions
for predicted hyperactivated motility by CART and
discriminant analysis, respectively, and the actual
counts for hyperactivated (H) and non-hyperactivated
(N) motility are given.

Figure 9. Smoothed scatterplot showing a possible interaction
effect on the percent of motile cells which are
hyperactivated (PMOTHY) attributable to lead exposure
expressed in terms of time and concentration.

Abstract

This article presents an analysis of the small-sample distribution of a
class of approximate pivotal quantities for a normal coefficient of variation
which contains the approximations of McKay (1932), David (1949), the 'naive'
approximate interval obtained by dividing the usual confidence interval on the
standard deviation by the sample mean, and a new interval closely related to
McKay (1932). For any approximation in this class, a series is given for $\varepsilon(t)$,
the difference between the cdfs of the approximate pivot and the reference
distribution. Let $\kappa$ denote the population coefficient of variation. For McKay
(1932), David (1949), and the 'naive' interval $\varepsilon(t) = O(\kappa^2)$, while for the new
procedure $\varepsilon(t) = O(\kappa^4)$. Examples involving strength data for a composite
material are discussed.

Key Words: Noncentral t distribution, chi-squared approximation, McKay's
approximation


1  Introduction 

If $X$ is a normal random variable with mean $\mu$ and variance $\sigma^2$, then the parameter

\[ \kappa \equiv \sigma/\mu \tag{1} \]

is called the population coefficient of variation. Let $X_i$, for $i = 1, \ldots, n$, be an
independent random sample, with $X_i \sim N(\mu, \sigma^2)$ for each $i$. In terms of the usual
sample estimates of the normal parameters

\[ \bar{X} = \sum_{i=1}^{n} X_i / n \tag{2} \]

and

\[ S^2 = \sum_{i=1}^{n} (X_i - \bar{X})^2 / (n - 1), \tag{3} \]

a point estimate of (1) is

\[ K \equiv S/\bar{X}. \tag{4} \]

This statistic is widely calculated and interpreted, often for very small $n$, usually
without an accompanying confidence interval. An exact method for confidence
intervals on $\kappa$ based on the noncentral t distribution is available (Lehmann, 1986,
p. 362) but it is computationally cumbersome; hence the need for approximate
intervals. In this article, we will investigate an approximate pivotal quantity which
can be used to easily calculate confidence intervals and perform hypothesis tests on
$\kappa$ which attain very nearly the nominal confidence level or size. These calculations
require only standard tables.
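
As a small illustration of the point estimate (4), the sketch below computes K from a sample using only the Python standard library. The function name is an assumption; the data are the tensile strengths listed in Section 5.

```python
import statistics


def coefficient_of_variation(sample):
    """Point estimate K = S / X-bar, with S the (n - 1)-divisor standard deviation."""
    return statistics.stdev(sample) / statistics.mean(sample)


# Tensile strengths (1000 psi) from the Examples section.
TENSILE = [326, 302, 307, 299, 329]

if __name__ == "__main__":
    print(statistics.mean(TENSILE), statistics.stdev(TENSILE),
          coefficient_of_variation(TENSILE))
```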

Let $Y_\nu$ denote a $\chi^2$ random variable with $\nu = n - 1$ degrees of freedom, and
define $W_\nu \equiv Y_\nu/\nu$. For $\alpha \in (0, 1)$, let $\chi^2_{\nu,\alpha}$ denote the $100\alpha$ percentile of the
distribution of $Y_\nu$, and let $t \equiv \chi^2_{\nu,\alpha}/\nu$ be the corresponding quantile of $W_\nu$. Define
the random variable

\[ Q \equiv \frac{K^2 (1 + \kappa^2)}{(1 + \theta K^2)\,\kappa^2}, \tag{5} \]

where $\theta = \theta(\nu, \alpha)$ is a known function. If we choose $\theta$ so that

\[ \Pr(Q \le t) \approx \Pr(W_\nu \le t), \tag{6} \]

then, since the distribution of $W_\nu$ is known and free of $\kappa$, we can use $Q$ as an
approximate pivot for constructing hypothesis tests and confidence intervals for
$\kappa$. We define the accuracy of the approximation (6) to be $\varepsilon(t) \equiv p - \alpha$, where
$p \equiv \Pr(Q \le t)$. Note that $p$ is the actual confidence level of a one-sided confidence
interval for $\kappa^2$, based on $Q$, having nominal confidence $\alpha$. In Section 2, we give
a Taylor series expansion for $\varepsilon(t)$ in powers of $\kappa^2$, leaving the details to the Appendix.
We then consider four choices for $\theta$, corresponding to the approximations
of McKay (1932) and David (1949), to the 'naive' approximate interval obtained
by dividing the usual confidence interval on the standard deviation by the sample
mean, and to a new interval closely related to McKay (1932).

McKay (1932) proposed that $Q$ and $W_\nu$ are approximately equal in distribution
when $\theta = \nu/(\nu + 1)$, but he was unable to investigate the small-sample
distribution of $Q$. Consequently, Fieller (1932) and Pearson (1932) performed a
simulation study, with satisfactory results. David (1949) proposed McKay's approximation
with $\theta = 1$; this suggestion has received much less attention than
McKay (1932). Much later, Iglewicz and Myers (1970) compared selected quantiles
of the approximate distribution for $K$, obtained from $Q$ with McKay's choice
of $\theta$, with the corresponding exact values obtained using the noncentral t distribution.
This numerical investigation demonstrated that McKay's approximation
is very good, at least for $n > 10$ and $0 < \kappa < .3$. Instead of examining differences
in quantiles numerically, we will investigate differences in cdfs analytically, and
thereby develop a deeper understanding of the small-sample properties of these
approximations.


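2  A Taylor Series for $\varepsilon(t)$

Denote the distribution of $W_\nu$ by $H_\nu(\cdot)$ so that, for $0 < \alpha < 1$, $H_\nu(t) = \alpha$. Since
$u(x) = x/(\theta x + 1)$ is a monotone function with inverse $u^{-1}(y) = y/(1 - \theta y)$,

\[ \Pr(Q \le t) = \Pr\!\left( \frac{K^2}{1 + \theta K^2} \le \frac{t\kappa^2}{1 + \kappa^2} \right) = \Pr\!\left[ \left( \frac{K}{\kappa} \right)^2 \le \frac{t}{1 + (1 - \theta t)\kappa^2} \right] \equiv p. \tag{7} \]

For a given choice of $\theta(\nu, \alpha)$, we have defined the accuracy of the corresponding
approximation to be $\varepsilon(t) \equiv p - \alpha$. In the Appendix, we show that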


>(<)  =  <«(<)  j  [a 


i  + 


V+  1 


(8) 


-6  +  1 1 1/  -  6  y3  +  t/3  --  3  >/  <  +  6  y3  f  -  3 1/3 1  +  3 1/3 13  -  t/3 13 

TfT+T)3 

(1  -  v  +  2t/f)  (1  -  0t)  (1  -  v  +  ut)  (2  -  i/+  vt)  (1  -  0tf 

+  1  +  v  +  2(1  +  1/) 

For McKay's approximation, $\theta \equiv \theta^{(1)} = \nu/(\nu + 1)$, and (8) becomes


«i(t)  fs  tHx 


:(<){ 


-2k2 

v+1 


2  +  9v  +  2v2-  3k"1  +  2 1»4 
2(1+1/)* 


(9) 


-3  v  t  +  9  v7 1  +  5  s'3 1  -  7  vA  t  -  5  v9  t2  -  2 v3  <a  +  9  i/A  <2  -  8  u*  t3  +  t/*  t* 


2(1  +  k)s 

+0(*8)}  • 

David (1949) proposed $Q$ with $\theta \equiv \theta^{(2)} = 1$ as an approximate pivot for a normal
coefficient of variation. The accuracy of David's approximation is

(<  -  2)k2  4  +  14»/ -  10i/2  +  4  i/3 

y  +  1  +  4(1  +  1/)* 


«a(<) 


■  «*t(o{- 


(10) 


+  -16<  +  4i/<  +  18  v2 1  -  14 1/3 1  +  6 12  -  lbut2  -  6 1/3 12  + 18 1/3 12 

4  (!  +  „)• 

5  t/f3  -  4i/2<3  -  10 t/3  <3  +  2  */2  f4  +  2 1/3 f 4 


+  0(«*)} . 


4(1  +  uy 


K 


Another reasonable choice for $\theta$ is $\theta^{(3)} \equiv 1/t$. Confidence intervals are obtained for
this choice of $\theta$ by simply dividing the endpoints of the usual confidence interval
for $\sigma$ by $\bar{X}$. The corresponding approximation has accuracy


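\[ \varepsilon_3(t) = t H_\nu'(t) \left\{ \frac{(1 - t)\nu - 1}{\nu + 1}\,\kappa^2 + \frac{-6 + 11\nu - 6\nu^2 + \nu^3 - 3\nu t + 6\nu^2 t - 3\nu^3 t + 3\nu^3 t^2 - \nu^3 t^3}{4(1 + \nu)^2}\,\kappa^4 + O(\kappa^6) \right\}. \tag{11} \]

Finally, note that if

\[ \theta \equiv \theta^{(4)} = \frac{\nu t + 2}{(\nu + 1)\,t} = \frac{\nu}{\nu + 1} \left( \frac{2}{\nu t} + 1 \right), \tag{12} \]

then the $O(\kappa^2)$ term in (8) is zero, and we have an approximation with accuracy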
-2-3*'+12i/2-9i/3  +  2t/4  +  i/t-  lbi,*t  +  2lv3t~  7vKl 


•4(0 


E  tUl(t)  | 


+ 


2(1+  v)* 

5  v3 12  -  16  u3 12  +  9 1/4  i2  +  4  v3 t3  -  5  u*  t3  +  vA  iK 


2  (1  +  i/)3 


K4  +  0(K«)J  .  (13) 

We will refer to the approximations corresponding to these four choices of $\theta$ as
Approximations 1-4, or the McKay, David, Naive, and Modified McKay approximations,
respectively.

3  Discussion 

If $\kappa$ is small, as is usually the case in practice, $\varepsilon_j(t)$ will also be small for
$j = 1, \ldots, 4$, so any of the above approximations will be satisfactory. For large samples,
$\theta^{(j)} \approx 1$ for $j \le 3$; hence the three corresponding methods are asymptotically
equivalent. Investigation of $\varepsilon_1(t)$ and $\varepsilon_2(t)$ demonstrates that David's approximation
is not clearly better than McKay's, and, in any case, McKay's method is much
more often used than David's. Also, Approximation 3, though adequate if $\kappa$ is sufficiently
small, is substantially less accurate than the other three approximations.
We will therefore not consider David's and the Naive approximation further, and
restrict attention primarily to the McKay and Modified McKay approximations.

Denote $\varepsilon(t)$ regarded as a function of $\alpha$ by $\tilde\varepsilon(\cdot)$, that is $\tilde\varepsilon(\alpha) \equiv \varepsilon[H_\nu^{-1}(\alpha)]$.
The difference $\varphi(\alpha) \equiv |\tilde\varepsilon_1(\alpha)| - |\tilde\varepsilon_4(\alpha)|$ will be positive when the Modified McKay
approximation is more accurate than McKay's approximation, and negative otherwise.
Hence this difference provides a means for comparing these two methods.
Using the noncentral t distribution, it is straightforward to evaluate $\varphi(\alpha)$ exactly,
and this is preferable to using the approximate formulas of the previous section.
In Figure 1, results are displayed of computing $\varphi(\alpha)$ numerically, for 20 values of $\kappa$
between .025 and .5; for sample sizes of 2, 5, 10, and 25; and for $\alpha$ equal to .01, .05,
.95, and .99. Note that the Modified McKay method is usually more accurate than
McKay's method. What is not clear from these differences is that, particularly
when $\kappa$ is small, the Modified McKay approximation is often extremely accurate:
in fact, virtually exact. This point is made by Figure 2, which shows the accuracies
of these two methods (as determined from the noncentral t distribution), as
functions of $\alpha$, for a sample size of 5, and for $\kappa = .05$ and $\kappa = .25$, respectively.

4  Confidence  Intervals  and  Hypothesis  Tests 

In this section, we illustrate how the approximate pivot (5) can be used for approximate
confidence intervals and one- and two-sample hypothesis tests. We assume
that $\kappa$ is positive, and that the probability of $\bar{X}$ being negative is negligible.
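
A $100(1 - \alpha)\%$ approximate confidence interval based on (5) is

\[ \left( \frac{K}{\sqrt{t_1(\theta_1 K^2 + 1) - K^2}},\; \frac{K}{\sqrt{t_2(\theta_2 K^2 + 1) - K^2}} \right), \tag{14} \]

where $t_1 \equiv \chi^2_{\nu,1-\alpha/2}/\nu$, $t_2 \equiv \chi^2_{\nu,\alpha/2}/\nu$, and $\theta_i$ is the value of $\theta$ used with $t_i$.
One-sided intervals can be determined similarly. If we let $u_i \equiv \nu t_i$ for $i = 1, 2$,
then we can write the McKay and Modified McKay confidence intervals as

\[ \Lambda_1 = \left( K \left[ \left( \frac{u_1}{\nu + 1} - 1 \right) K^2 + \frac{u_1}{\nu} \right]^{-1/2},\; K \left[ \left( \frac{u_2}{\nu + 1} - 1 \right) K^2 + \frac{u_2}{\nu} \right]^{-1/2} \right) \tag{15} \]

and

\[ \Lambda_4 = \left( K \left[ \left( \frac{u_1 + 2}{\nu + 1} - 1 \right) K^2 + \frac{u_1}{\nu} \right]^{-1/2},\; K \left[ \left( \frac{u_2 + 2}{\nu + 1} - 1 \right) K^2 + \frac{u_2}{\nu} \right]^{-1/2} \right), \tag{16} \]

respectively.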
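
The McKay and Modified McKay intervals can be sketched in a few lines once the two chi-squared quantiles are available. This is an illustrative implementation under stated assumptions: the function name and argument order are hypothetical, the endpoint formula is the one obtained by setting theta equal to nu/(nu + 1) or to (nu t + 2)/((nu + 1) t) in the general interval, and the quantiles are passed in by the caller (from tables or a statistics package) because the Python standard library does not provide them.

```python
import math


def cv_interval(k_hat: float, nu: int, u1: float, u2: float,
                method: str = "mckay") -> tuple:
    """Approximate confidence interval for the coefficient of variation.

    u1 and u2 are the upper and lower chi-squared quantiles on nu degrees
    of freedom. method is "mckay" or "modified" (Modified McKay).
    """
    shift = 2.0 if method == "modified" else 0.0

    def endpoint(u: float) -> float:
        radicand = ((u + shift) / (nu + 1) - 1.0) * k_hat ** 2 + u / nu
        if radicand <= 0.0:
            # The endpoint does not exist when the radicand is not positive.
            raise ValueError("interval endpoint does not exist")
        return k_hat / math.sqrt(radicand)

    return endpoint(u1), endpoint(u2)


if __name__ == "__main__":
    # Tensile example: K = .045, nu = 4, quantiles 11.14 and .4844.
    print(cv_interval(0.045, 4, 11.14, 0.4844, "mckay"))
    print(cv_interval(0.045, 4, 11.14, 0.4844, "modified"))
```

With the rounded inputs of the tensile-strength example, the Modified McKay call gives approximately (.0269, .1299), and the McKay upper endpoint differs from it only in the fourth decimal place.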

Since $(1 + \kappa^2)/\kappa^2$ in (5) is a monotone function of $\kappa^2$, we can also use (5) to test
the null hypothesis $H_0 : \kappa = \kappa_0$, for some known $\kappa_0$. An endpoint of the interval
(14) does not exist if $t(\theta K^2 + 1) - K^2 \le 0$, or equivalently,

\[ K^2 \ge \frac{t}{1 - \theta t}. \tag{17} \]

In order for (17) to hold for the choices of $\theta$ considered in this article, either $K^2$
must be large or $t$ must be small. Neither of these conditions is likely to occur in
practice except possibly when $n$ and $t$ are both very small. If $K$ is small but (17)
holds, then one can either reduce the confidence level, increase the sample size,
or else use the exact method based on the noncentral t distribution. Note that if
$\theta_i = \theta^{(3)} = 1/t_i$ for $i = 1, 2$, then (14) becomes

\[ \Lambda_3 = \left( K/\sqrt{t_1},\; K/\sqrt{t_2} \right), \tag{18} \]

which is the usual interval on $\sigma$, with the endpoints divided by $\bar{X}$.

Assume that we are given two independent random samples of sizes $n_1$ and $n_2$,
having population coefficients of variation $\kappa_1$ and $\kappa_2$, with sample estimates $K_1$
and $K_2$, respectively. From (6) we see that

\[ \frac{K_1^2 (1 + \theta_2 K_2^2)}{K_2^2 (1 + \theta_1 K_1^2)} \approx \frac{\kappa_1^2 (1 + \kappa_2^2)}{\kappa_2^2 (1 + \kappa_1^2)}\, F_{\nu_1,\nu_2} \equiv \tau F_{\nu_1,\nu_2}, \tag{19} \]

where $F_{\nu_1,\nu_2}$ denotes an F random variable with $\nu_1 = n_1 - 1$ numerator and
$\nu_2 = n_2 - 1$ denominator degrees of freedom. When $\kappa_1 = \kappa_2$, $\tau = 1$, and it is
easy to show that $\tau$ is monotone increasing in $\rho \equiv \kappa_1^2/\kappa_2^2$. In fact, since we are
assuming that both $\kappa_1$ and $\kappa_2$ are small, $\tau \approx \rho$. Hence, we have an approximate
F test for the equality of two coefficients of variation, analogous to the usual F
test for the equality of variances.

5  Examples 

The tensile strengths of five specimens of a composite material are as follows (in
1000 psi): 326, 302, 307, 299, 329. We have $\bar{X}_1 = 312.6$ and $S_1 = 13.94$, so
that $K_1 = .045$, $u_1 = \chi^2_{4,.975} = 11.14$, and $u_2 = \chi^2_{4,.025} = .4844$. Equations (15)
and (16) lead to confidence intervals on $\kappa$ by the McKay and Modified McKay
methods, respectively. For this example the Modified McKay procedure gives the
95% confidence interval (.0269, .1299), which differs from the McKay interval only
in the fourth decimal place.

Five specimens of the same material are tested in shear, giving shear strengths
as follows (in 1000 psi): 9.7, 9.6, 9.4, 9.4, 10.9. For these shear data, $\bar{X}_2 = 9.8$,
$S_2 = .6285$, and $K_2 = .064$. To test the null hypothesis that the population
coefficient of variation for tensile strength equals the corresponding value for shear
strength, we compute (for the McKay method)

\[ \frac{.045^2 \left[ 1 + (.8)(.064^2) \right]}{.064^2 \left[ 1 + (.8)(.045^2) \right]} = .495. \tag{20} \]

Since the probability that an $F_{4,4}$ random variable is less than .495 is .256, there
appears to be insufficient evidence to reject this null hypothesis. Note that the
Modified McKay method is not appropriate for this significance test since $\theta^{(4)}$ is
a function of $\alpha$.
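
The two-sample calculation can be checked with a short sketch. The function names are hypothetical. The closed form used for the F cdf is specific to 4 and 4 degrees of freedom, where the regularized incomplete beta function I_x(2, 2) reduces to x^2 (3 - 2x) with x = f/(1 + f); other degrees of freedom would need tables or a statistics library.

```python
def mckay_ratio(k1: float, k2: float, nu1: int, nu2: int) -> float:
    """Approximate F statistic for equality of two coefficients of variation,
    using McKay's choice theta = nu / (nu + 1) for each sample."""
    theta1 = nu1 / (nu1 + 1)
    theta2 = nu2 / (nu2 + 1)
    return (k1 ** 2 * (1 + theta2 * k2 ** 2)) / (k2 ** 2 * (1 + theta1 * k1 ** 2))


def f_4_4_cdf(f: float) -> float:
    """CDF of an F random variable with (4, 4) degrees of freedom."""
    x = f / (1.0 + f)
    return x * x * (3.0 - 2.0 * x)


if __name__ == "__main__":
    ratio = mckay_ratio(0.045, 0.064, 4, 4)
    print(ratio, f_4_4_cdf(ratio))
```

With the rounded values K1 = .045 and K2 = .064, the ratio and lower-tail probability agree with the .495 and .256 quoted in the text to three decimals.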

6  Conclusion 

A class of approximate pivotal quantities for a normal coefficient of variation related
to the approximation of McKay (1932) has been investigated analytically,
with particular emphasis on four special cases. The most important results are
that, if $\kappa$ denotes the population coefficient of variation, then the difference between
the actual and nominal levels of McKay's (1932) confidence interval is of
$O(\kappa^2)$, and that a very slight modification of McKay's method leads to an apparently
new $O(\kappa^4)$ method which is usually superior to McKay (1932), and which is
recommended.


Appendix:  Derivation  of  Equation  (8) 


For most applications $\kappa$ will be small, so our plan is to let

\[ q \equiv \frac{t}{1 + (1 - \theta t)\kappa^2} \tag{21} \]

and to expand $\Pr[(K/\kappa)^2 \le q]$ in a Taylor series in $\kappa^2$, then to expand each term
in this series again in powers of $\kappa^2$, using (21). We assume throughout that $q$ is
nonnegative; this imposes slight restrictions on $\alpha$ and $\kappa$ which are not important
in practice.

The random variables $\bar{X}$ and $S$ are equal in distribution to

\[ \bar{X} = \mu + Z\sigma/\sqrt{n} \tag{22} \]

and

\[ S = \sigma \sqrt{W_\nu}, \tag{23} \]

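respectively, where $Z \sim N(0, 1)$, and $Z$ and $W_\nu$ are independent. Hence

\[ \Pr\!\left[ \left( \frac{K}{\kappa} \right)^2 \le q \right] = \Pr\!\left( T_{\nu,\delta} \ge \frac{\delta}{\sqrt{q}} \right), \tag{24} \]

where $T_{\nu,\delta}$ denotes a noncentral-t random variable with degrees of freedom $\nu$ and
noncentrality parameter $\delta \equiv \sqrt{n}/\kappa$. By conditioning on $Z$ and expanding in a
Taylor series about $Z = 0$, we have that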


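\[ \Pr\!\left[ \left( \frac{K}{\kappa} \right)^2 \le q \right] = E\left\{ H_\nu\!\left[ q \left( 1 + \frac{Z\kappa}{\sqrt{n}} \right)^2 \right] \right\} = H_\nu(q) + q H_\nu'(q) \left\{ \frac{(1 - q)\nu - 1}{\nu + 1}\,\kappa^2 + \frac{-6 + 11\nu - 6\nu^2 + \nu^3 - 3\nu q + 6\nu^2 q - 3\nu^3 q + 3\nu^3 q^2 - \nu^3 q^3}{4(\nu + 1)^2}\,\kappa^4 + O(\kappa^6) \right\}. \tag{25} \]

Using (21), the terms in (25) can now be expanded in powers of $\kappa^2$ about $\kappa^2 = 0$,
giving

\[ H_\nu(q) = H_\nu(t) + t(\theta t - 1) H_\nu'(t)\,\kappa^2 + \frac{t(\theta t - 1)^2}{2} \left[ 2 H_\nu'(t) + t H_\nu''(t) \right] \kappa^4 + O(\kappa^6), \tag{26} \]

\[ q H_\nu'(q) = t H_\nu'(t) + t(\theta t - 1) \left[ H_\nu'(t) + t H_\nu''(t) \right] \kappa^2 + O(\kappa^4), \tag{27} \]

and

\[ \frac{(1 - q)\nu - 1}{\nu + 1} = \frac{(1 - t)\nu - 1}{\nu + 1} - \frac{\nu t(\theta t - 1)}{\nu + 1}\,\kappa^2 + O(\kappa^4). \tag{28} \]

Using the identity

\[ t H_\nu''(t) = \left[ \frac{\nu}{2}(1 - t) - 1 \right] H_\nu'(t), \tag{29} \]

substituting (26), (27), and (28) into (25), and collecting terms in $\kappa^2$ leads to (8).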
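
A quick numerical check of the leading term may help in following the derivation. The sketch below assumes that the coefficient of kappa^2 inside the braces of (8) is theta t - 1 + ((1 - t) nu - 1)/(nu + 1), which is what the kappa^2 terms of (25) and (26) combine to once t H'(t) is factored out. The function names are hypothetical.

```python
def kappa2_coefficient(theta: float, t: float, nu: int) -> float:
    """Assumed coefficient of kappa^2 in the accuracy series, after factoring t * H'(t)."""
    return theta * t - 1.0 + ((1.0 - t) * nu - 1.0) / (nu + 1.0)


def theta_choices(t: float, nu: int) -> dict:
    """The four choices of theta considered in the article."""
    return {
        "mckay": nu / (nu + 1.0),
        "david": 1.0,
        "naive": 1.0 / t,
        "modified_mckay": (nu * t + 2.0) / ((nu + 1.0) * t),
    }


if __name__ == "__main__":
    nu, t = 4, 0.7
    for name, theta in theta_choices(t, nu).items():
        print(name, kappa2_coefficient(theta, t, nu))
```

For any t and nu the McKay choice gives -2/(nu + 1), the David choice gives (t - 2)/(nu + 1), and the Modified McKay choice gives zero, which is why that approximation is of order kappa^4.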

Figure 1: [title truncated] ..., 10 and 25

Figure 2: Exact Accuracies of McKay and Modified McKay vs. Alpha
(n = 5, kappa = .05 and .25). Horizontal axis: Alpha (on logit scale).

Summary

This paper discusses the estimation of the parameters of the Cosinor model. The standard
periodogram-based approach of Walker (1971) produces biased estimates of all the parameters, and it will
be shown that the bias in the frequency estimate can be substantial. An alternative periodogram-based
estimator of frequency is proposed and is shown to have minimal bias.

Some key words: Cosinor model; Frequency estimation; Periodogram; Time series.


1.  Introduction 

Suppose we have a time series $y_t$, $t = 1, 2, \ldots, n$, where $y_t$ is the observation taken at time $t$.
The model we consider is

\[ y_t = \alpha_0 \cos(\omega_0 t) + \beta_0 \sin(\omega_0 t) + e_t, \tag{1} \]

where the errors $e_t$ are assumed to be independent with $E(e_t) = 0$ and $\mathrm{Var}(e_t) = \sigma^2$ for all $t$. This model
was proposed for the analysis of biological rhythms by Halberg, Tong and Johnson (1965), who called
it the Cosinor model. Further details have been given in Halberg, et al. (1972), Nelson, et al. (1979) and
Bingham, et al. (1982). This model has been extensively used and reported in the chronobiology literature,
and computer programs for its implementation have been published by Monk and Fort (1983) and Vokac
(1984).

In matrix notation, this model can be expressed as

\[ y = X_0 b_0 + e, \]

where $y$ is the $n \times 1$ vector of the observations, $X_0$ is the $n \times 2$ design matrix with $\cos(\omega_0 t)$ in row
$t$ of the first column, and $\sin(\omega_0 t)$ in row $t$ of the second column. The $2 \times 1$ vector $b_0$ equals
$(\alpha_0, \beta_0)'$ and $e$ is the $n \times 1$ vector of error terms. In the conventional Cosinor model, the frequency (or
equivalently the period) is considered to be fixed and known, and the subscript 0 indicates that $X_0$ is a
function of the true parameter value $\omega_0$. In this case, the model is linear in the unknown parameters $\alpha_0$
and $\beta_0$, and the usual least squares estimates of $b_0$ apply.

Many  times,  however,  we  may  not  know  the  true  frequency  and  we  need  a  way  to  estimate  all 
the  parameters  simultaneously.  A  common  approach  is  to  use  the  method  of  nonlinear  least  squares 
estimation.  For  the  Cosinor  model,  these  methods  have  problems  with  converging  to  local  rather  than 
global  minima,  and  are  extremely  sensitive  to  the  choice  of  starting  values.  Alternative  methodology  is 
desirable. 

2. Standard periodogram estimators for the Cosinor model

A natural approach to this estimation problem would be to use a method based on the
periodogram, or some function of it. The periodogram has long been used in the hidden periodicity
problem, and E.T. Jaynes (1987) has demonstrated that it is a sufficient statistic for inferences about a
single stationary frequency, when the errors are normally distributed.

Periodogram-based estimators for the Cosinor model were proposed by Whittle (1952), and have
been extensively discussed by Walker (1971), who derived many of their properties. The estimators are

\[ \hat\alpha(\hat\omega) = \frac{2}{n} \sum_{t=1}^{n} y_t \cos(\hat\omega t) \tag{2} \]

\[ \hat\beta(\hat\omega) = \frac{2}{n} \sum_{t=1}^{n} y_t \sin(\hat\omega t) \tag{3} \]

where $\hat\omega$ is such that

\[ I_n(\hat\omega) = \max_{0 < \omega < \pi} I_n(\omega) \]

and

\[ I_n(\omega) = \frac{n}{2} \left[ \hat\alpha(\omega)^2 + \hat\beta(\omega)^2 \right]. \]

The expression $I_n(\omega)$ is one of the usual definitions of the periodogram, and we will refer to it as
the Standard periodogram. The estimator $\hat\omega$ is defined as the value of $\omega$ that results in the absolute
maximum of $I_n(\omega)$ for $0 < \omega < \pi$. This estimate of $\omega$ is used in (2) and (3) to get the estimates of $\alpha_0$
and $\beta_0$. These estimators are equal to the least squares estimates when $\omega_0$ is known and $n = cP_0$, and are
called approximate least squares estimates (Bloomfield, 1976). Walker proved that they are consistent and
asymptotically normal, and gave expressions for their asymptotic variance matrix. Rice and Rosenblatt
(1988) show that the estimates of $\alpha_0$ and $\beta_0$ will be consistent only when $\omega_0$ is estimated with precision
$o(n^{-1})$, and that the asymptotic theory should be used cautiously.
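
The estimators can be sketched as a direct search over a frequency grid. This is an illustration under stated assumptions, not a published Cosinor program: the function names and the uniform grid on (0, pi) are hypothetical, and the periodogram is taken in the form (2/n) times the squared modulus of the Fourier sum, which matches the quadratic form with matrix A = (2/n)XX' used in Section 3.

```python
import math
from typing import List, Tuple


def fourier_sums(y: List[float], omega: float) -> Tuple[float, float]:
    """Return (sum of y_t cos(omega t), sum of y_t sin(omega t)) for t = 1..n."""
    c = sum(y_t * math.cos(omega * t) for t, y_t in enumerate(y, start=1))
    s = sum(y_t * math.sin(omega * t) for t, y_t in enumerate(y, start=1))
    return c, s


def periodogram(y: List[float], omega: float) -> float:
    """Standard periodogram, assumed here to be (2/n) * (c^2 + s^2)."""
    c, s = fourier_sums(y, omega)
    return (2.0 / len(y)) * (c * c + s * s)


def walker_estimates(y: List[float], grid_size: int = 2000) -> Tuple[float, float, float]:
    """Grid-search the periodogram maximum, then plug it into (2) and (3)."""
    n = len(y)
    omegas = [math.pi * k / grid_size for k in range(1, grid_size)]
    omega_hat = max(omegas, key=lambda w: periodogram(y, w))
    c, s = fourier_sums(y, omega_hat)
    return omega_hat, 2.0 * c / n, 2.0 * s / n


if __name__ == "__main__":
    # Noise-free illustrative series with hypothetical parameter values.
    y = [2.0 * math.cos(0.8 * t) + 1.0 * math.sin(0.8 * t) for t in range(1, 101)]
    print(walker_estimates(y))
```

Even for a noise-free series the maximizing frequency need not equal the true frequency exactly, and a small frequency error rotates the estimated phase, so the individual amplitude estimates can be off noticeably more than their combined magnitude.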

In obtaining the Walker estimates of frequency, Diggle (1990) suggested considering all
frequencies $0 < \omega < \pi$, not just the Fourier frequencies. The algorithm searches for the ordinate that results
in a maximum along a grid of specified length, centered on the Fourier frequency that produces the
maximum periodogram. This type of approach is also discussed in Rice and Rosenblatt (1988) and Zhao-Guo
(1988).

The Walker estimators have good asymptotic properties, but can be significantly biased for a
moderate length time series. If the model holds and $E(y_t) = \alpha_0 \cos(\omega_0 t) + \beta_0 \sin(\omega_0 t)$, then
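
\[ E(\hat\alpha(\omega) \mid \omega) = \frac{\alpha_0}{n} \left[ C_n(\omega + \omega_0) + C_n(\omega - \omega_0) \right] + \frac{\beta_0}{n} \left[ S_n(\omega + \omega_0) - S_n(\omega - \omega_0) \right], \]

where

\[ C_n(x) = \sum_{t=1}^{n} \cos(xt) = \frac{\sin(nx/2)}{\sin(x/2)} \cos\!\left( \frac{(n + 1)x}{2} \right), \qquad S_n(x) = \sum_{t=1}^{n} \sin(xt) = \frac{\sin(nx/2)}{\sin(x/2)} \sin\!\left( \frac{(n + 1)x}{2} \right). \]

This expression simplifies to the following if $\omega = \omega_0$:

\[ E(\hat\alpha(\omega_0) \mid \omega_0) = \alpha_0 + \frac{\alpha_0}{n} C_n(2\omega_0) + \frac{\beta_0}{n} S_n(2\omega_0). \]

Similar expressions hold for $\hat\beta$.

Any bias or estimation error in $\hat\omega$ may result in additional bias in the estimates of $\alpha_0$ and $\beta_0$,
and even if we are able to estimate $\omega_0$ exactly, our estimates of $\alpha_0$ and $\beta_0$ will still be biased if $n$ is not
an integer multiple of the true period.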

3. Bias in the standard periodogram estimator of frequency

There has been little published on the bias of the Walker estimator of $\omega_0$. The exact bias has not
been determined, but Bloomfield gives an indication of the approximate bias, credited to Whittle (1952),
as

\[ E(\hat\omega) = \omega_0 + \text{terms involving } \frac{1}{n}. \]

Rice and Rosenblatt also discuss the bias of the frequency estimate, and show that for a data series of
moderate length the bias can be significant. In a simulation with $\alpha_0 = 8$, $\beta_0 = 0$, a third parameter
set to 0.3 (its symbol is illegible in the source), and $n = 100$,
the bias was shown to be .0013, which is more than twice the standard error indicated by the asymptotic
theory.

An analytic expression for the bias of the frequency estimator cannot be derived. As an
alternative, we will consider an approximation suggested by the work of Rice and Rosenblatt (1988). To
measure the bias in the estimator of $\omega_0$, we will approximate $E(\hat\omega)$ by the value of $\omega$ that maximizes
the expected value of the periodogram.

To derive the expectation, the periodogram is reexpressed in matrix notation. As in Section 1,
define the n x 2 matrix

$$X(\omega) = \begin{pmatrix} \cos(\omega) & \sin(\omega) \\ \cos(2\omega) & \sin(2\omega) \\ \vdots & \vdots \\ \cos(n\omega) & \sin(n\omega) \end{pmatrix}.$$


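For simplicity, we will write it as $X$ in the discussion that follows, remembering that it is actually a
function of $\omega$. Again, let us denote $X(\omega_0)$ as $X_0$. The Walker estimators of $\alpha_0$ and $\beta_0$ can then be written
in matrix notation as

$$\begin{pmatrix}\hat\alpha \\ \hat\beta\end{pmatrix} = \frac{2}{n}\,X'y,$$

and the Standard periodogram can be expressed as

$$I_n(\omega) = \frac{n}{2}\left(\hat\alpha^2 + \hat\beta^2\right),$$

which is

$$I_n(\omega) = \frac{2}{n}\,y'XX'y.$$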


The periodogram in matrix notation has a familiar form from the study of linear models. The
matrix $XX'$ is square (n x n) and symmetric, making the periodogram a positive semidefinite quadratic
form $y'Ay$ with matrix $A = (2/n)XX'$.

The expectation is easily derived using properties of quadratic forms. The Cosinor model can
be written as $y_t = s_t + e_t$, for $t = 1, 2, \ldots, n$, with $s_t = \alpha_0\cos(\omega_0 t) + \beta_0\sin(\omega_0 t)$. In this form, the
vector y is composed of a signal vector s, with t-th element $s_t$, and a vector of errors e. For the general
signal plus noise model, the expectation of the corresponding quadratic form is

$$E(y'Ay) = \sigma^2\,\mathrm{trace}(A) + s'As.$$

Applying this to the Standard periodogram we have

$$E(I_n(\omega)) = 2\sigma^2 + \frac{2}{n}\,s'XX's.$$


The  first  term  has  no  effect  on  the  location  of  the  maximum,  so  only  the  second  term  will  need  to  be 
considered. 

The expected periodogram is a function of the true parameter values $\alpha_0$, $\beta_0$, $\omega_0$ and the sample
size n. Given values of these parameters, we can find the $\omega$ that results in the maximum and obtain an
approximation to the bias. In general, for given values of $\omega_0$ and n, it can be shown that the bias will be
the same for all $\alpha_0$ and $\beta_0$ such that $\beta_0 = k\alpha_0$. The bias depending on k is equivalent to the bias depending
on the phase $\theta_0 = \mathrm{arctan2}(-\beta_0/\alpha_0)$, since $\tan(\theta_0) = -k$ when $\beta_0 = k\alpha_0$. We consider $\theta_0$ equal to 0, $\pi/4$, $\pi/2$
and $3\pi/4$. The values selected for the true frequency $\omega_0$ are the midpoints of each quarter of the interval
$(0, \pi]$. Sample sizes n ranged from 10 to 500. In order to ensure that we find the global maximum of
the expected periodogram, we use a grid search algorithm. The expected periodogram is first calculated
at the Fourier frequencies, and the Fourier frequency that results in the largest periodogram ordinate is
identified. This frequency is $\omega_m$. Using $\omega_m$ as the center of the grid, a refined search with a grid mesh
of 0.0001 is made from $\omega_{m-2}$ to $\omega_{m+2}$, representing a range of four Fourier frequencies. Extensive
simulations show that the expected periodogram approach gives excellent estimates of the true bias.
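
The grid search just described can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' program: the function names (`design`, `expected_standard`, `approx_bias`) are hypothetical, the Fourier frequencies are assumed to be $2\pi j/n$ for $j = 1, \ldots, \lfloor n/2 \rfloor$, and only the signal term $(2/n)s'XX's$ is maximized, since the noise term of the Standard periodogram does not move the maximum.

```python
import numpy as np

def design(omega, n):
    """n x 2 matrix with rows (cos(omega*t), sin(omega*t)), t = 1..n."""
    t = np.arange(1, n + 1)
    return np.column_stack((np.cos(omega * t), np.sin(omega * t)))

def expected_standard(omega, s):
    """Signal part of the expected Standard periodogram, (2/n) s'XX's."""
    n = len(s)
    v = design(omega, n).T @ s
    return 2.0 / n * float(v @ v)

def approx_bias(alpha0, beta0, omega0, n, mesh=1e-4):
    """Maximizer of the expected periodogram minus the true frequency."""
    s = design(omega0, n) @ np.array([alpha0, beta0])
    # Coarse pass: evaluate at the (assumed) Fourier frequencies in (0, pi].
    fourier = 2 * np.pi * np.arange(1, n // 2 + 1) / n
    coarse = [expected_standard(w, s) for w in fourier]
    w_m = fourier[int(np.argmax(coarse))]
    # Refined pass: two Fourier spacings on either side of w_m.
    step = 2 * np.pi / n
    lo = max(w_m - 2 * step, mesh)
    hi = min(w_m + 2 * step, np.pi)
    grid = np.arange(lo, hi + mesh, mesh)
    fine = [expected_standard(w, s) for w in grid]
    return grid[int(np.argmax(fine))] - omega0
```

For example, `approx_bias(8.0, 0.0, 0.5, 100)` returns the approximate bias for one parameter setting; the result is resolved only to the 0.0001 mesh.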

Figure 3.1 gives a graphical picture of the bias in the estimator of frequency for one set of
parameters, $\theta_0 = 0$ and true frequency $\omega_0 = \pi/8$. It shows that the bias itself is a periodic function of
sample size. This type of pattern was seen for all combinations of parameter values we looked at, though
the bias was sometimes negative, or alternated from positive to negative with increasing n.


Figure 3.1. Bias in the Standard Periodogram Estimate, $\theta_0 = 0$, $\omega_0 = \pi/8$.


4.  Improved  estimators  for  the  cosinor  model 

We note the similarity between Walker's estimators and least squares estimators in Section 1. For
non-Fourier frequencies, the Walker estimates of $\alpha_0$ and $\beta_0$ are approximately equal to the usual least squares
estimates when $\omega$ is known, and are equal when $n = \infty$ or when $\omega$ is a Fourier frequency. This
relationship suggests another definition of the periodogram for non-Fourier frequencies, where we replace
the "approximate" Walker estimates by the actual ordinary least squares estimators.

The ordinary least squares estimators of $\alpha_0$ and $\beta_0$ are given by

$$\begin{pmatrix}\hat\alpha_{LS} \\ \hat\beta_{LS}\end{pmatrix} = (X'X)^{-1}X'y.$$

Define the "least squares" periodogram as

$$I_n^{LS}(\omega) = \frac{n}{2}\left(\hat\alpha_{LS}^2 + \hat\beta_{LS}^2\right),$$

or equivalently as the quadratic form

$$I_n^{LS}(\omega) = y'Ay, \qquad A = \frac{n}{2}\,X(X'X)^{-1}(X'X)^{-1}X'.$$

Since the actual least squares and Walker estimators are equal at the Fourier frequencies, the two
periodograms are also equal at these frequencies. As we have replaced "approximate" estimators with
"exact" estimators in the new periodogram, we might expect our estimate of frequency to improve also.

Again, the bias of the least squares periodogram estimator of frequency is approximated by
maximizing the expected periodogram. The least squares periodogram is also a quadratic form, with
$A = \frac{n}{2}X(X'X)^{-1}(X'X)^{-1}X'$. Unlike the Standard periodogram, we initially cannot ignore the
contribution of the error term, since

$$\mathrm{trace}(A) = \frac{2n^2}{n^2 - D_n^2(2\omega)},$$

where

$$D_n(\omega) = \frac{\sin(n\omega/2)}{\sin(\omega/2)}.$$

This term has negligible effect unless the signal to noise ratio is very small, n is small or $\omega_0$ is close to
zero or $\pi$. The calculations of bias are performed for the same values of $\theta_0$, $\omega_0$ and n as for the Standard
periodogram.
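
As a numerical check of the trace expression, the short sketch below compares the trace of the least squares quadratic-form matrix, computed directly, with the closed form $2n^2/(n^2 - D_n^2(2\omega))$. The function names and the test values of n and $\omega$ are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def ls_trace(omega, n):
    """trace of A = (n/2) X (X'X)^{-1} (X'X)^{-1} X', computed directly."""
    t = np.arange(1, n + 1)
    X = np.column_stack((np.cos(omega * t), np.sin(omega * t)))
    G = np.linalg.inv(X.T @ X)
    A = n / 2.0 * X @ G @ G @ X.T
    return float(np.trace(A))

def closed_form(omega, n):
    """2 n^2 / (n^2 - D_n(2*omega)^2), with D_n(2*omega) = sin(n*omega)/sin(omega)."""
    d = np.sin(n * omega) / np.sin(omega)
    return 2.0 * n**2 / (n**2 - d**2)
```

At a Fourier frequency $D_n(2\omega) = 0$, so both functions return 2, the same constant that appears for the Standard periodogram.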

The results show that estimation of the frequency based on the least squares periodogram is also
biased and does not really offer an improvement over conventional Walker estimates. This is illustrated
in Figure 4.1, which reveals an interesting and possibly useful relationship. In the cases examined, the bias
of the Least Squares periodogram estimate is always of the opposite sign to the bias in the Walker
periodogram estimate, and of approximately equal magnitude. This suggests using a periodogram that is in
some sense a combination of the two periodograms.

The simplest combination is the arithmetic average of the two periodograms, a quadratic form
with

$$A = X\left[\frac{1}{n}\,I + \frac{n}{4}\,(X'X)^{-1}(X'X)^{-1}\right]X'.$$


Based on the biases we have seen with the Standard and Least Squares periodograms, the
estimate of frequency based on the average periodogram is expected to give an unbiased estimator of $\omega_0$,
or at least one with reduced bias. Calculations and simulations show that the latter is true. The new
average periodogram estimator has less bias than either the Walker or Least Squares estimators, and the
bias approaches zero faster as n becomes large. See Figure 4.2.


Another approach is to use a geometric average. Let us define the "composite" periodogram by

$$I_n^{C}(\omega) = \frac{n}{2}\left(\hat\alpha\,\hat\alpha_{LS} + \hat\beta\,\hat\beta_{LS}\right).$$

Substituting in the matrix representations of the estimates gives

$$I_n^{C}(\omega) = y'Ay, \qquad A = X(X'X)^{-1}X'.$$

The matrix A is the familiar hat matrix from linear regression, and $I_n^{C}(\omega)$ has a form similar to the
regression sum of squares in linear models. Again, when $\omega$ is a Fourier frequency, $(X'X)^{-1} = (2/n)\,I$, and
the Composite periodogram will be equal to the Standard periodogram.

We will show that the value that maximizes the expectation of the composite periodogram is
always $\omega_0$. Consider the expression

$$R_n(\omega) = s's - s'X(X'X)^{-1}X's = s'(I - H)s,$$

where s is the signal vector and $H = X(X'X)^{-1}X'$. This is the difference between the total sum of
squares of the signal and the value of the periodogram at $\omega$. The following are true:

$$s's > 0 \quad (s \ne 0),$$
$$s'Hs \ge 0,$$
$$s'(I - H)s \ge 0.$$

The first statement is immediate since the expression $s's$ is a sum of squares, while the second and third
follow because $H$ and $I - H$ are both idempotent. From these we can also conclude $s's \ge s'Hs$. This
means that the absolute maximum value $s'Hs$ can possibly attain is $s's$. For the Cosinor model
the signal is $s = X_0\varphi_0$, so when $X = X_0$ we have

$$s'Hs = \varphi_0' X_0'X_0 (X_0'X_0)^{-1} X_0'X_0 \varphi_0 = s's.$$

Thus, an absolute maximum occurs when $X = X_0$, i.e. when $\omega = \omega_0$. It can also be shown that the absolute
maximum is unique.

The key portion of this proof is that $(I - H)$ is idempotent, and thus positive semidefinite. This
does not occur for the other periodograms. In fact, it can be shown by numeric example that $s'As$ can
exceed $s's$ in all of the other periodograms.
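
A small numeric illustration of this argument is sketched below. The values n = 10, $\omega_0 = 0.3$, $\alpha_0 = 0$, $\beta_0 = 1$ and the search grid are arbitrary choices made for this example, not values from the paper, and the function names are hypothetical. The sketch evaluates $s'Hs$ (Composite) and $(2/n)s'XX's$ (Standard) for a noise-free signal.

```python
import numpy as np

def design(omega, n):
    """n x 2 matrix with rows (cos(omega*t), sin(omega*t)), t = 1..n."""
    t = np.arange(1, n + 1)
    return np.column_stack((np.cos(omega * t), np.sin(omega * t)))

def signal_forms(omega, s):
    """Return (s'Hs, s'As) for the Composite and Standard periodograms."""
    n = len(s)
    X = design(omega, n)
    v = X.T @ s
    composite = float(v @ np.linalg.solve(X.T @ X, v))  # s'X(X'X)^{-1}X's
    standard = 2.0 / n * float(v @ v)                    # (2/n) s'XX's
    return composite, standard

# Illustrative (assumed) parameter values.
n, omega0 = 10, 0.3
s = design(omega0, n) @ np.array([0.0, 1.0])   # alpha0 = 0, beta0 = 1
total = float(s @ s)                           # s's

grid = np.arange(0.05, 3.1, 0.001)
vals = np.array([signal_forms(w, s) for w in grid])
comp_at_truth, std_at_truth = signal_forms(omega0, s)
```

In this example the Composite form never exceeds $s's$ on the grid and attains it at $\omega_0$, whereas the Standard form evaluated at $\omega_0$ is already larger than $s's$.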

The performance of the Composite Periodogram estimator of frequency was examined by
simulation study. From the Cosinor Model with $\alpha_0 = 8.0$, $\beta_0 = 0.0$ and $\omega_0 = 0.5$, a total of 1000 data
series, each of length $n = 100$, were generated with random Gaussian noise of mean 0 and variance 1.
The results of the simulation are reported in Table 4.1. The Standard periodogram produced biased
estimates of frequency, as well as bias in the estimates of the other two parameters. The Composite
periodogram gave an unbiased frequency estimate and the estimates of $\alpha_0$ and $\beta_0$ were greatly improved,
but still biased. The biases were expected, since in Section 2 we saw that even with $\omega$ estimated without
bias or error, we still obtain biased estimates for $\alpha_0$ and $\beta_0$. Using this fact, we construct the bias-corrected
estimates

$$\alpha^* = \hat\alpha - \frac{\hat\alpha}{n}\,C_n(2\hat\omega)D_n(2\hat\omega) - \frac{\hat\beta}{n}\,S_n(2\hat\omega)D_n(2\hat\omega)$$

and

$$\beta^* = \hat\beta - \frac{\hat\alpha}{n}\,S_n(2\hat\omega)D_n(2\hat\omega) + \frac{\hat\beta}{n}\,C_n(2\hat\omega)D_n(2\hat\omega).$$

We  will  call  these  the  adjusted  Composite  Periodogram  estimators.  Simulations,  reported  in  Table  4.2, 
indicate  that  the  adjusted  estimators  are  now  approximately  unbiased. 

5.  Summary  and  conclusions 

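We have considered four definitions of the periodogram, given by

$$I_n(\omega) = \frac{2}{n}\,y'XX'y,$$
$$I_n^{LS}(\omega) = \frac{n}{2}\,y'X(X'X)^{-1}(X'X)^{-1}X'y,$$
$$I_n^{AV}(\omega) = \frac{1}{2}\left(I_n(\omega) + I_n^{LS}(\omega)\right),$$
$$I_n^{C}(\omega) = y'X(X'X)^{-1}X'y.$$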

These  are  called  the  Standard,  Least  Squares,  Average  and  Composite  periodograms.  The  periodograms 
are  equal  at  the  Fourier  frequencies,  differing  only  in  how  they  approximate  the  periodogram  between 
these  frequencies.  They  may  be  expressed  as  quadratic  forms  in  y,  allowing  us  to  easily  compute  their 
expectations  and  to  compute  the  approximate  bias  in  the  frequency  estimates  based  on  those  periodograms. 
Based on maximizing the expected periodogram, the Standard and Least Squares periodograms
produce biased frequency estimators for moderate n. The Average periodogram estimators are also biased,
but to a lesser degree than either component. The estimators based on the Composite periodogram are,
on the other hand, unbiased for all combinations of true parameter values and all n. The Composite
periodogram also has a familiar interpretation in terms of the least squares problem of fitting cosine
curves, making it easy to implement.

We also propose bias-adjusted estimators of $\alpha_0$ and $\beta_0$ using the Composite periodogram
estimator of frequency. Simulations show that these estimators are approximately unbiased, and that the
Standard and Composite Periodogram estimators have similar variances. Based on these results, we would
strongly recommend using our new estimators for fitting the Cosinor model to individual data series.


Meta-Analysis of Gas Flow Resistance Measurements Through Packed Beds
Malcolm S. Taylor and Csaba K. Zoltani¹


Measurements of the resistance to flow through packed beds of inert spheres have
been reported by a number of authors through relations expressing the coefficient of drag
as a function of Reynolds number. A meta-analysis of the data using improved statistical
methods is undertaken to aggregate the available experimental results. For Reynolds
number in excess of $10^3$ the relation $\log F_v = 0.49 + 0.90 \log Re'$ is shown to be a highly
effective representation of all available data.


Nomenclature 


$D_b$ = spherical particle (bead) diameter
$D_c$ = test chamber diameter
$f_s$ = $F_v / [Re/(1-\phi)]$, friction factor
$F_v$ = $\dfrac{\Delta P\,D_b^2}{L\,\mu\,\bar{u}}\left(\dfrac{\phi}{1-\phi}\right)^2$, coefficient of drag
$F_{v_i}$ = i-th observed value of the drag coefficient
$\hat F_{v_i}$ = predicted drag coefficient corresponding to the i-th observed value
$L$ = length scale
$Re$ = $Re_p\,\phi = \rho\,\bar{u}\,D_b\,\phi/\mu$, Reynolds number
$Re'$ = $Re/(1-\phi)$
$Re_p$ = Reynolds number based on particle size
$\bar{u}$ = average gas velocity
$\beta_i$, i = 0, 1, 2 = model coefficients
$\Delta P$ = change in pressure
$\rho$ = density
$\phi$ = porosity of the packed bed
$(1-\phi)$ = solids loading
$\mu$ = gas viscosity


1. U.S. Army Research Laboratory, Aberdeen Proving Ground, Maryland 21005-5066. This material
also appears in ARL-TR-301, November 1993.

1. Introduction

Experimental results are cumulative if in aggregate they unify and extend empirical
relations and theoretical structures which may be obscured in individual investigations.
Empirical cumulativeness, which Hedges (1987) describes as "... the degree of agreement
among replicated experiments or the degree to which related experimental results fit into
a simple pattern that makes conceptual sense," is the focus of this paper. Glass (1976)
was among the first to recommend the use of quantitative procedures in integrative research
reviews and to introduce the term "meta-analysis" to cover the collection of such
procedures. Meta-analysis claims certain classical statistical procedures, as well as approaches
developed specifically for research synthesis, and has found application in the
social and biological sciences. The unification of experimental results obtained by different
investigators, operating independently with their own experimental protocol and
sometimes using different methods of analysis, is the kernel of meta-analysis. A comprehensive
treatment of this subject is given by Hedges and Olkin (1985).

Measurement in the physical sciences is generally regarded as highly accurate, and
although some variability is inevitable, the variation itself is thought to be insignificant
from a practical standpoint. Counterexamples to this notion are plentiful, even in carefully
conducted experiments. Consider, for instance, the situation described by Touloukian
(1975) involving two sets of measurements taken on the thermal conductivity of gadolinium.
These data, shown in Figure 1, "... are for the same sample, measured in the same
laboratory two years apart in 1967 and 1969. The accuracy of curve 1 was stated as within
1% and that of curve 2 as 0.5% ..." and yet, the curves differ by more than several hundred
percent at higher values of temperature.

Figure 1. Thermal conductivity of gadolinium.

Physical scientists normally bring a careful qualitative analysis to their research
studies. If prudently employed, interrogative statistics, which are part of meta-analysis,
have a contribution to make in the physical sciences as well.

After data has been collected according to a carefully constructed experimental design
(e.g., see Montgomery 1991) the main reason for determining a correlation (a regression
analysis) is to examine the effects that some variables exert, or appear to exert, on
others. Even when no intuitive physical relationship is apparent, regression analysis may
provide a convenient summary of the data. The summary can be accomplished in a number
of ways and has been an active area of investigation since the time of A.M. Legendre
(1752-1833), who published the first account of regression by least squares in 1805. Section
2 of this paper reviews the correlations that have been advanced for steady flow
through inert spherically packed beds and some of the consequences of the attendant data
analysis. In Section 3, a meta-analysis of the gas flow resistance measurements is undertaken.
Section 4 contains a summary and main conclusions.


2.  Regression  Analysis  of  Gas  Flow  Resistance  Measurements 

Ergun (1952), Kuo and Nydegger (1978), and Jones and Krier (1983) have proposed
models relating coefficient of drag to Reynolds number for steady flow through
packed beds of inert spheres. However, the correlations were developed under different
experimental regimens. Robbins and Gough (1978) also investigated coefficient of drag
at high Reynolds number but presented their results in terms of a friction factor
$f_s = F_v/[Re/(1-\phi)]$, which is the ratio of coefficient of drag $F_v$ and Reynolds number $Re$
scaled by a solids loading factor $(1 - \phi)$.

In comparing Ergun's relation

$$F_v = 150 + 1.75\left(\frac{Re}{1-\phi}\right), \qquad (1)$$

to that of Kuo and Nydegger

$$F_v = 276.23 + 5.05\left(\frac{Re}{1-\phi}\right)^{0.87}, \qquad (2)$$

or of Jones and Krier

$$F_v = 150 + 3.89\left(\frac{Re}{1-\phi}\right)^{0.87}, \qquad (3)$$

a slight notational difference portends substantial complications. Equation (1) is a simple
linear model. Equations (2)-(3) are nonlinear in the sense that one or more parameters
appear nonlinearly. Nonlinearity complicates the statistical analysis of the data since determining
appropriate choices for the parameters in equations (2)-(3) becomes a computationally
intensive optimization procedure, and inference about the resultant relation and
parameters becomes much more tentative. The mathematical underpinnings of nonlinear
regression will not support as much in the way of statistical inference or hypothesis testing
as is available for linear regression. In general, nonlinear models should be avoided
unless there is a compelling reason for their use. Draper and Smith (1981) discuss this issue
in greater detail.

Standard regression procedures are developed under several assumptions. Fundamental
among these is that the response (here, $F_v$) is measured with error but the predictor(s)
(here, $Re$ and $\phi$) are measured without error. Jones and Krier provide estimates of
error for $F_v$, $Re$, and $\phi$, confirming that this assumption is not met, and call into question
the efficacy of the resultant correlations. Sometimes an attempt to circumvent this requirement
is undertaken by arguing that the error in predictor measurement is sufficiently
small as to be ignored when compared to the range of the predictor variable. If this claim
is invoked, reliance upon any resultant representation must be tempered accordingly.

Since a correlation provides a convenient representation of the available data, a direct
attempt at evaluating the adequacy of a regression equation involves an examination
of the differences between the measurements taken and the values predicted by the equation.
These differences, $F_{v_i} - \hat F_{v_i}$, $i = 1, 2, \ldots, n$, are called residuals; $F_{v_i}$ is an experimentally
determined value of drag coefficient, and $\hat F_{v_i}$ is the corresponding value predicted
by the regression equation. A residual plot for equation (3) is shown in Figure 2.
These plots may serve as a diagnostic tool in addition to assessing the adequacy of a fitted
regression model.


Figure 2. Residuals vs. particle diameter $D_b$;
Jones and Krier data with 6 mm beads excluded.


Figure 2 strongly suggests that another crucial regression assumption is not satisfied.
The variance of the residuals does not appear constant over the range of $Re' = Re/(1 - \phi)$;
and moreover, the departure from the fitted equation is systematic with bead
diameter, $D_b$. Jones and Krier recommend reverting to the relation (2) proposed by Kuo
and Nydegger to describe their own measurements taken for 6 mm beads. This recommendation
is data specific and is difficult to justify in general. They conjecture that an interaction
between bead size and tube diameter may be present, but this requires quantitative
substantiation. In general, weighted least squares, or a transformation on the observations
$F_{v_i}$ before regression, are potential corrective procedures suggested by this residual
pattern.


2.1  Regression  Analysis  Revisited 

Nonlinear regression algorithms normally seek to minimize the sum of the squared
residuals, as in ordinary linear regression, in attempting to determine the "best" choice
of parameters to model the data. These procedures have previously been cited as computationally
intensive. More specifically, they are iterative and may diverge or converge to
local extrema, depending upon the choice of initial conditions. Through a systematic selection
of initial conditions, the authors determined that the equation

$$F_v = 61 + 2.7\left(\frac{Re}{1-\phi}\right)^{0.91}, \qquad (4)$$

provides an improved representation of the data reported by Jones and Krier.
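
One way to reduce the sensitivity to initial conditions for a model of the form $F_v = \beta_0 + \beta_1 x^{\beta_2}$, with $x = Re/(1-\phi)$, is to note that the model is linear in $\beta_0$ and $\beta_1$ once $\beta_2$ is fixed. The Python sketch below profiles over a grid of exponents and solves a linear least squares problem at each one. This is a substitute technique shown for illustration, not the iterative procedure the authors used; the function name is hypothetical and the data are synthetic, noise-free values generated from the form of equation (4), not the Jones and Krier measurements.

```python
import numpy as np

def fit_power_model(x, y, exponents):
    """Fit y = b0 + b1 * x**b2 by profiling over a grid of b2 values."""
    best = None
    for b2 in exponents:
        # For fixed b2 the model is linear in (b0, b1).
        Z = np.column_stack((np.ones_like(x), x ** b2))
        coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
        sse = float(np.sum((y - Z @ coef) ** 2))
        if best is None or sse < best[0]:
            best = (sse, coef[0], coef[1], b2)
    sse, b0, b1, b2 = best
    rmse = np.sqrt(sse / (len(x) - 3))   # three fitted parameters
    return b0, b1, b2, rmse

# Synthetic (assumed) data lying exactly on a curve of the form of equation (4).
x = np.linspace(1e3, 1e5, 40)
y = 61.0 + 2.7 * x ** 0.91
b0, b1, b2, rmse = fit_power_model(x, y, np.arange(0.50, 1.2001, 0.01))
```

Because every exponent on the grid is examined, the search cannot stall in a local minimum between grid points that it never visits; the price is that $\beta_2$ is resolved only to the grid spacing.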

The root mean square error (RMSE), an estimate of the standard deviation of the
residuals and a commonly used measure for adequacy of fit, is reduced by 20% compared
to that corresponding to equation (3). The measurements taken on the 6 mm beads, the
chief contributor to heterogeneity of variance, have been excluded from the regression,
making the comparison with Jones and Krier direct. A reduction of one-fifth in RMSE is
not by itself a stunning improvement, but it does focus more sharply on the underlying
physical process. The residual plot for equation (4) still exhibits the undesirable pattern
of under(over) fitting categories of bead diameter, but is an improvement compared to the
display in Figure 2.

The data collected by Robbins and Gough (1978, 1979), which "... correspond to
several tests performed on several occasions" for beds of spheres, right circular cylinders,
and multiperforated cylinders, may be transformed into units appropriate for comparison
through the relationship $f_s = F_v/[Re/(1-\phi)]$. The authors confined the analysis to data taken
on 1.27 mm diameter lead shot and on 4.76 mm and 7.94 mm diameter steel spheres, and
determined the equation

$$F_v = -237 + 3.14\left(\frac{Re}{1-\phi}\right)^{0.89}, \qquad (5)$$

for representation of flow through spherically packed beds. Equations (4)-(5) are shown,
along with the previously established correlations (1)-(2), in Figure 3.


Figure 3. Proposed models for relating coefficient of drag
and Reynolds number.


Transforming the variables ($Re'$, $F_v$) by taking logarithms, which was suggested by
the residual plot in Figure 2, effectively linearizes the data. In regression analysis, a measure
of precision of the regression line, which is used in addition to RMSE, is given by a
statistic denoted as $R^2$. $R^2$ assumes values in the unit interval [0, 1] and quantifies the
amount of variation in the response accounted for by the regression line. Values close to
one are highly desirable, indicating that the regression has effectively accounted for most
of the variation in the response. The regression line determined after logarithmic transformation
of the Jones and Krier data has $R^2 = 0.98$. The transformed Robbins and
Gough data have $R^2 = 0.99$. These values are so close to 1.0 that pursuit of a nonlinear
model is difficult to justify mathematically.
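
The log-transformed fit and its $R^2$ can be computed with a few lines of Python. The sketch below is illustrative only: the function name is hypothetical, base-10 logarithms are an assumption (the text does not state the base), and the data are synthetic values scattered about a line of the form $\log F_v = 0.49 + 0.90 \log Re'$, not the experimental measurements.

```python
import numpy as np

def loglog_fit(re_prime, fv):
    """Least squares line log10(Fv) = a + b*log10(Re') and its R^2."""
    lx, ly = np.log10(re_prime), np.log10(fv)
    b, a = np.polyfit(lx, ly, 1)            # slope first, then intercept
    resid = ly - (a + b * lx)
    r2 = 1.0 - np.sum(resid**2) / np.sum((ly - ly.mean())**2)
    return a, b, float(r2)

# Synthetic (assumed) data with small scatter on the log scale.
rng = np.random.default_rng(0)
re_prime = np.logspace(3, 5, 50)
fv = 10 ** (0.49 + 0.90 * np.log10(re_prime) + rng.normal(0, 0.03, 50))
a, b, r2 = loglog_fit(re_prime, fv)
```

Here $R^2$ is one minus the ratio of the residual sum of squares to the total sum of squares of the transformed response, which is the quantity described in the paragraph above.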

Comparison between linear models and nonlinear models is difficult. RMSE values
cannot be compared across the transformation, and a well-defined $R^2$ statistic for nonlinear
models does not exist.


3.  Meta-Analysis  of  Gas  Flow  Resistance  Measurements 

Consider in aggregate the correlations that have been advanced for gas flow resistance
measurements through spherically packed beds. For the nonlinear models, a statistical
resampling plan is applied, whose goal is to extract information from a set of data
through repeated inspection. The procedure is called the "bootstrap," named to convey its
self-help attributes, and it attempts to address an important problem in data reduction:
having computed an estimate of some parameter, what accuracy can be attached to the estimate?
Accuracy here refers to the "± something" that often accompanies statistical estimates,
and may be conveyed through such devices as variance, RMSE, or confidence interval.
For the log-linear model, the available data are directly combined.

The authors are hindered in fully exploiting a meta-analysis approach by the inability
to obtain all of the pertinent experimental data. It is unfortunate that experimental data
are not routinely archived after collection; otherwise, additional information that it may
hold is lost to extraction by subsequent investigations and by alternative statistical methods.
The data of Jones and Krier and of Robbins and Gough were accessible. With these
data, this paper proceeds as far as statistical prudence permits.

3.1  Bootstrapping  Regression  Correlations 

Detailed descriptions of the bootstrap and accounts of its successful applications
are amply documented (e.g., Efron (1979, 1982), Efron and Tibshirani (1985), LePage
and Billard (1992)). The computational contrivance that the bootstrap procedure exploits
is the generation of perturbed data sets from a single set of data through sampling with replacement.
Specific to this study, the set of paired observations taken on coefficient of
drag and Reynolds number, $\{(F_{v_1}, Re_1'), \ldots, (F_{v_n}, Re_n')\}$, that is the basis for a reported
correlation, is sampled with replacement to generate another set
$\{(F_{v_1}^{*}, Re_1^{\prime *}), \ldots, (F_{v_n}^{*}, Re_n^{\prime *})\}$ whose elements are copies (with duplication) of the
original measurements. This set is called a bootstrapped data set. The process of sampling
with replacement to generate bootstrapped data sets is repeated many times.

If a correlation is determined for each bootstrapped data set and its equation plotted,
an indication of the sensitivity of the regression line to perturbation of the original
data comes into focus. In Figure 4, the results of 1000 replications of this process are
pictured. The outermost lines indicate boundaries within which the correlation (5) might
be expected to lie if the original data set were simply perturbed. They were obtained
from the maxima and minima of the drag coefficient predicted for particular values of
$Re'$.² The envelope constructed for correlation (5) contains correlation (4). This suggests
that no significant difference between these empirical relations exists. Similar results are
obtained if we begin with correlation (4); correlation (5) will lie within the corresponding
confidence envelope. Consideration of perturbed data is highly appropriate here, since
experimental results cannot be expected to be reproduced, even if the experiment is replicated
under tightly controlled conditions. The theoretical justification for the use of bootstrapped
data is given by Efron (1982).
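
The resampling scheme described above can be sketched as follows. To keep the example short, it refits a straight line to each bootstrapped data set rather than the nonlinear correlations used in the paper, and it takes extreme quantiles of the predicted values as the envelope. The function names, the quantile levels, and the synthetic data are assumptions made for this illustration.

```python
import numpy as np

def fit_line(x, y):
    """Least squares intercept and slope."""
    b, a = np.polyfit(x, y, 1)
    return a, b

def bootstrap_envelope(x, y, x_eval, n_boot=1000, q=(0.005, 0.995), seed=0):
    """Envelope of refitted lines over bootstrapped (x, y) pairs."""
    rng = np.random.default_rng(seed)
    n = len(x)
    curves = np.empty((n_boot, len(x_eval)))
    for k in range(n_boot):
        idx = rng.integers(0, n, size=n)   # sample pairs with replacement
        a, b = fit_line(x[idx], y[idx])
        curves[k] = a + b * x_eval         # prediction from this replicate
    lower, upper = np.quantile(curves, q, axis=0)
    return lower, upper

# Synthetic (assumed) data on a log scale, for illustration only.
rng = np.random.default_rng(1)
x = np.linspace(3.0, 5.0, 40)
y = 0.49 + 0.90 * x + rng.normal(0, 0.03, 40)
x_eval = np.array([3.0, 4.0, 5.0])
lower, upper = bootstrap_envelope(x, y, x_eval)
a, b = fit_line(x, y)
```

A second fitted relation can then be compared with the envelope by checking whether its predictions at `x_eval` fall between `lower` and `upper`.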

The relationship of Kuo and Nydegger, for which the experimental data was not accessible,
was determined for a single diameter bead, $D_b = 0.83$.


2. More precisely, the values represent extreme quantiles after all of the $\hat F_v$ have been ranked; their
values are not essentially different from maxima and minima.

Figure 4. Bootstrapped confidence envelopes for nonlinear regression
(based on Robbins-Gough data).


3.2  Log-linear  Regression 

Figure 5 displays the logarithmic transformed data of Jones and Krier, and Robbins
and Gough, combined. The fitted line for these data is

$$\log F_v = 0.49 + 0.90 \log Re'; \qquad (6)$$

included in the regression are the data taken on 6 mm beads which were previously excluded.

Visually, the data appear linear after transformation. Statistically, the $R^2$ value for
the regression is 0.99, making the fitted line a highly satisfactory representation of these
data for all practical purposes.

Figure 5. Drag coefficient vs. Reynolds number/$(1-\phi)$;
Jones and Krier, Robbins and Gough data, combined.

4. Summary and Conclusions

For Reynolds number exceeding $10^3$, a more effective representation and data analysis
than presently available can be obtained after logarithmic transformation of the data.
This linearizes the data and removes the necessity for nonlinear regression techniques.
The equation

$$\log F_v = 0.49 + 0.90 \log Re' \qquad (7)$$

is an effective description of the available experimental data.

If a representation of the form

$$F_v = \beta_0 + \beta_1\left(\frac{Re}{1-\phi}\right)^{\beta_2} \qquad (8)$$

is required, then Jones and Krier's results are more effectively reflected through the equation

$$F_v = 61 + 2.7\left(\frac{Re}{1-\phi}\right)^{0.91}, \qquad (9)$$

and Robbins and Gough's data restricted to spherically packed beds provide the relation

$$F_v = -237 + 3.14\left(\frac{Re}{1-\phi}\right)^{0.89}, \qquad (10)$$

but here again, approximate confidence envelopes constructed with the aid of the bootstrap
suggest that these relations can be combined without loss of underlying physical insight.
In total, the statistical analysis supports the combination of the various correlations,
for the stated test conditions, into a single relationship.

While  it  is  quite  reasonable  to  suspect  an  interaction  between  the  geometry  of  tube 
and  packing,  perhaps  reflected  through  the  ratio  De/Db,  more  extensive  testing  is  required 
to  establish  this  relation.  Hopefully  this  will  be  done  in  accordance  with  a  formal  statisti¬ 
cal  experimental  design  to  minimize  testing  and  maximize  extraction  of  information. 

G.EP,  Box,  an  important  contemporary  statistician,  has  remarked  that  "No  model 
is  correct,  but  some  are  useful."  In  this  spirit  these  remarks  are  offered  along  with  the 
hope  for  an  incremental  move  toward  a  more  useful  model. 
DESKTOP MODELS FOR WEAPONS ANALYSIS

JOHN D'ERRICO
EUGENE F. DUTOIT


DISMOUNTED WARFIGHTING BATTLE LAB


A. INTRODUCTION

The purpose of this report is to provide a collection of simple, desktop computer models for
operations research analysts and others within the combat developments area.

It was not intended that any of these models should replace the more complex models available
to combat developers. There seems to be no lack of complex models, or of efforts to produce more
of the same. This report, on the other hand, attempts to attack the other end of the modeling
spectrum.

Many operations research analysts, authors of new concepts, action officers who are developing
operational requirements documents, and others, are not well served by the large, complex
models which demand much in the way of resources, time, knowledge, and money in order to
use them.

On the other hand, there has been a substantial void in the number of models available to combat
developments action officers to help them in their day-to-day work. This is the area which this
report attempts to address, at least to some extent.

Each model in this report has been thoroughly researched, developed, and tested. Ample
references to source documents have been cited. The format used to describe each model was
based on ease of understanding and use.

A 3.5" disk, containing all the models, sample data files, and programs described herein, can be
obtained by sending a blank, DOS-formatted, 3.5" double density or high density disk to:

Commandant
U.S. Army Infantry School
ATTN: ATSH-WCS (Mr. D'Errico)
Fort Benning, GA 31905-5400
B. PROBABILITY OF HIT MODEL

1. Introduction.

a. Description. This model, developed by Mr. John D'Errico, calculates the probability of
hit for direct fire, single shot weapons, against a point target. Inputs required are the weapon's
biases and dispersions, the target dimensions, the firer's aimpoint on the target, and the range to
the target. The resulting probability of hit is displayed on the screen.

b. Limitations. This program computes the probability of hit based on the measurements
of a two-dimensional target. It uses a process similar to the one used in many wargame models.

c. Applications. Desktop analytic tool for studying weapon accuracies and calculating
probabilities of hit based on biases and dispersions provided by AMSAA and JMEM.

d. Setup. This model runs on a DOS-based computer. Data can be entered into the
model in a few minutes, and results are displayed in less than one minute.

2. Guide to Operation.

a. Equipment Required.

(1) IBM compatible PC computer.

(2) 3.5" disk drive.

b. Installation.

(1) Turn on the computer and get to the DOS prompt.

(2) Insert the 3.5" disk containing the PHCALC model into your computer's disk drive.

(3) From the DOS prompt, enter the command: A:PHCALC (or B:PHCALC if you're
using the B: disk drive).

c. Definitions.

(1) Horizontal shift in aimpoint from target center, in centimeters: If the firer's
intended aimpoint is to the left of the target's center, the user should input the number of
centimeters to the left as a negative number (e.g., -23). If the firer's intended aimpoint is to the
right of the target's center, enter the number of centimeters to the right as a positive number (e.g.,
12).

(2) Vertical shift in aimpoint from target center, in centimeters: If the firer's intended
aimpoint is below the target's center, enter the number of centimeters from the center as a
negative number. If the firer's intended aimpoint is above the target's center, enter the number of
centimeters from the center as a positive number.

d. Operation.

(1) This program determines the probability of hit on a rectangular, two-dimensional
target. If you wish to obtain the probability of hit on a target composed of two rectangles (e.g., a
vehicle consisting of a hull and turret), you need only keep in mind the location of the single
aimpoint on the target, run this program twice (once for each rectangle), and manually add
the two resulting probabilities.

(2) For example, assume you are firing a missile at a tank 300 meters away. Its frontal
measurements are: 300 cm wide by 200 cm high for the hull, and 200 cm wide by 100 cm high
for the turret. Your aimpoint is the junction between the turret and the hull. The missile's biases
and dispersions are shown in Screen #1. Using this program you will determine the separate
probabilities of hit for the turret and the hull, keeping in mind that your aimpoint for both is the
turret ring. When you enter the data for determining the probability of hit against the turret, your
vertical shift in aimpoint from the center of mass is -50 cm because your actual aimpoint is 50 cm
below the turret's center of mass. In determining the probability of hit for the hull, you must
indicate an upward shift of +100 cm from the hull's center of mass. Adding both probabilities will
give you the probability of hit against the target.

(3) Actual prompts and sample inputs for the turret are shown in Screen #1.


Horizontal fixed bias (mils):? 0
Vertical fixed bias (mils):? 0

Select one of the following:

1 - Total horizontal & total vertical dispersions
2 - Separate variable & random error dispersions.

(Enter 1 or 2 from the keyboard)? 1

Total horizontal dispersions (mils):? 3
Total vertical dispersions (mils):? 3

Target width (centimeters):? 200
Target height (centimeters):? 100
Distance to target (meters):? 300

Horizontal shift in aimpoint from target center (cm):? 0
Vertical shift in aimpoint from target center (cm):? -50


Screen #1


(4)  The  following  result  will  be  displayed  on  your  screen. 


The  probability  of  hit,  P(H),  is  .2754702 


Screen  #2 

(5) To run the program again, enter A:PHCALC or B:PHCALC.


Horizontal fixed bias (mils):? 0
Vertical fixed bias (mils):? 0

Select one of the following:

1 - Total horizontal & total vertical dispersions
2 - Separate variable & random error dispersions.

(Enter 1 or 2 from the keyboard)? 1

Total horizontal dispersions (mils):? 3
Total vertical dispersions (mils):? 3

Target width (centimeters):? 300
Target height (centimeters):? 200
Distance to target (meters):? 300

Horizontal shift in aimpoint from target center (cm):? 0
Vertical shift in aimpoint from target center (cm):? 100


Screen  #3 

(6)  The  probability  of  hit  for  the  hull  will  be  displayed  as  follows: 


The  probability  of  hit,  P(H),  is  .4444504 


Screen  #4 

(7) Adding the results for the turret (Screen #2) and the hull (Screen #4) gives the
probability of hit on the tank.

  .2754702
+ .4444504
----------
  .7199206
e. Explanation.

(1) This model uses the probability density function for a random variable having a
normal distribution:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right],$$

where $\mu$ is the population mean and $\sigma$ is its standard deviation. In calculating
probability of hit, the fixed bias is taken as the mean, and the total variable biases and
dispersions are taken as the standard deviation.

(2) Since a rectangular target has two dimensions, width and height, the problem
becomes one of determining the joint probabilities of hitting the target within its horizontal
boundaries and, simultaneously, within its vertical boundaries.

(3) To keep measurements consistent, the biases and dispersions are converted from
mils to centimeters on target at the given range to the target. This conversion is given by the
equation

$$200\,(\text{Range in meters})\,\tan\left[(\theta/2)(0.00098175)\right]$$

where the constant 0.00098175 is used to convert mils to radians, and $\theta$ represents the
mean or standard deviation in mils.

(4) Given that the mean and standard deviation in the horizontal direction are
converted to centimeters, and the width of the target is in centimeters, the horizontal
boundaries of the target are transformed to standard form using the equation
$z = (x-\mu)/\sigma$. The probability density function of the standard normal variable then becomes

$$\phi(z) = \frac{1}{\sqrt{2\pi}} \exp\left(-z^2/2\right)$$

(5) Integration of the probability density function with the z-scores as the limits of
integration yields the probability of hit in the horizontal direction.

(6) The same process is used to determine the probability of hitting the target within
the vertical boundaries of the target.

(7) Finally, the probability of hitting the target within the horizontal boundaries is
multiplied by the probability of hitting the target within the vertical boundaries, and the result is
the probability of hitting the target.

(8) The trapezoidal method of integration, with 100 intervals, is used in this model.
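The steps in this explanation can be sketched in a few lines of Python. This is not the PHCALC source, only an illustration under stated assumptions: the function names are hypothetical, and the aimpoint shift is assumed to move the mean point of impact in target-centered coordinates. The sample inputs are those of the turret and hull example above.

```python
import math

MIL_TO_RAD = 0.00098175  # mils-to-radians constant quoted in the text


def mils_to_cm(mils, range_m):
    """Convert a bias or dispersion in mils to centimeters on target."""
    return 200.0 * range_m * math.tan(0.5 * mils * MIL_TO_RAD)


def std_normal_pdf(z):
    """Probability density of the standard normal variable."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def trapezoid(f, lo, hi, n=100):
    """Trapezoidal rule with n equal intervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        total += f(lo + i * h)
    return total * h


def axis_probability(bias_mils, disp_mils, size_cm, shift_cm, range_m):
    """Probability of landing between the two target edges along one axis."""
    # Assumption: mean impact = aimpoint shift + fixed bias, from target center.
    mu = shift_cm + mils_to_cm(bias_mils, range_m)
    sigma = mils_to_cm(disp_mils, range_m)
    z_lo = (-0.5 * size_cm - mu) / sigma
    z_hi = (0.5 * size_cm - mu) / sigma
    return trapezoid(std_normal_pdf, z_lo, z_hi)


def probability_of_hit(range_m, width_cm, height_cm, disp_h, disp_v,
                       bias_h=0.0, bias_v=0.0, shift_h=0.0, shift_v=0.0):
    """Joint probability: horizontal probability times vertical probability."""
    p_h = axis_probability(bias_h, disp_h, width_cm, shift_h, range_m)
    p_v = axis_probability(bias_v, disp_v, height_cm, shift_v, range_m)
    return p_h * p_v


turret = probability_of_hit(300.0, 200.0, 100.0, 3.0, 3.0, shift_v=-50.0)
hull = probability_of_hit(300.0, 300.0, 200.0, 3.0, 3.0, shift_v=100.0)
print(turret, hull, turret + hull)
```

Under these assumptions the two results should land close to the values shown in Screens #2 and #4, and their sum close to the combined probability for the tank.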
C. PROBABILITY OF HIT PLOTTING MODEL


1. Introduction.

a. Description. This model, developed by Mr. John D'Errico, plots the hits for direct fire,
single shot weapons, against a point target. Inputs required are the weapon's biases and
dispersions, the target dimensions, the range to the target, and the number of iterations (i.e., the
number of single shots to be plotted). The results are displayed graphically on the screen.

b. Limitations. This program plots the strike of each bullet relative to a two-dimensional
target.

c. Applications. Desktop analytic tool for studying weapon accuracies. In effect, this
model is a graphic, stochastic version of the PHCALC probability of hit model.

d. Setup. This model runs on a DOS-based PC computer. Data can be entered into the
model in a few minutes, and results are displayed in less than one minute.

2.  Guide  to  Operation. 

a.  Equipment  Required. 

(1)  IBM  compatible  PC  computer. 

(2)  3.5"  disk  drive. 

b.  Installation. 

(1)  Turn  on  the  computer  and  get  to  the  DOS  prompt. 

(2)  Insert  the  3.5"  disk  containing  the  PHCALC  model  into  your  computer's  disk  drive. 

(3)  The  process  for  printing  the  display  with  a  printer  depends  on  the  version  of  DOS 
being  used,  the  type  keyboard,  and  the  type  of  printer,  but  you  must  prepare  for  it  now. 

(a) For DOS 5.0, type the command GRAPHICS GRAPHICS (the word
"graphics," typed twice, separated by a space) at the DOS prompt before running this
program. With an enhanced keyboard, pressing the [Print Screen] key, or the [Shift] + [Print
Screen] keys, should print the display on your printer.

(b) If your version of DOS is older than 5.0, you should type the command
GRAPHICS at the DOS prompt before running this program. If you have an unenhanced
keyboard, the keys [Shift] + [PrtSc] should print the screen on the printer.

(4) From the DOS prompt, enter the command: A:PHPLOT (or B:PHPLOT if you're
using the B: disk drive).

c. Operation.

(1) This model places a target on the screen, scaled to the height and width inputs, and
then displays the impact of each round in the target area, according to the weapon's biases and
dispersions and range to the target.

(2) Program prompts and sample inputs are shown below, in Screen #1.

Enter the horizontal fixed bias (mils) of the weapon system? 0
Enter the vertical fixed bias (mils) of the weapon system? 0
Total horizontal variable biases & dispersions (mils)? 3
Total vertical variable biases & dispersions (mils)? 3
Enter the weapon-target range in meters? 300
Enter the height of the target in meters? 2
Enter the width of the target in meters? 1
Enter the number of single shots to be fired? 200

Screen  #1 

(3) Upon entering the last input, a result similar to the one shown on the following
page will appear on the screen. The display remains on the screen until any key is pressed, in
case the user wishes to print the display on his printer. Pressing any key (except the print screen
keys) will clear the screen and return the user to the DOS prompt.

(4) To run the program again, enter A:PHPLOT or B:PHPLOT, whichever is
appropriate.

d. Explanation. Given the biases and dispersions, this model samples from a normal
probability distribution for the accuracy of each round, then determines the impact point based on
the range to the target. The method used to generate normally distributed (pseudo) random
numbers was proposed by Marsaglia and Bray in 1964. (Reference 8)
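The sampling step can be sketched as follows. The sketch assumes that the Marsaglia and Bray method referred to is the polar method for generating pairs of normal deviates; the report does not give the details, so this is an illustration and not the PHPLOT source. The function names are hypothetical, the mils-to-centimeters conversion is borrowed from the PHCALC explanation, and the aimpoint is assumed to be the target center.

```python
import math
import random

MIL_TO_RAD = 0.00098175  # mils-to-radians constant quoted for PHCALC


def polar_normal_pair(rng):
    """Return two independent standard normal deviates (polar method)."""
    while True:
        u = 2.0 * rng.random() - 1.0
        v = 2.0 * rng.random() - 1.0
        s = u * u + v * v
        if 0.0 < s < 1.0:  # keep only points strictly inside the unit circle
            factor = math.sqrt(-2.0 * math.log(s) / s)
            return u * factor, v * factor


def mils_to_cm(mils, range_m):
    """Convert a bias or dispersion in mils to centimeters on target."""
    return 200.0 * range_m * math.tan(0.5 * mils * MIL_TO_RAD)


def simulate_shots(n_shots, range_m, width_m, height_m,
                   bias_h=0.0, bias_v=0.0, disp_h=3.0, disp_v=3.0, seed=1):
    """Sample impact points (cm from target center) and count the hits."""
    rng = random.Random(seed)
    mu_x, mu_y = mils_to_cm(bias_h, range_m), mils_to_cm(bias_v, range_m)
    sd_x, sd_y = mils_to_cm(disp_h, range_m), mils_to_cm(disp_v, range_m)
    half_w, half_h = 50.0 * width_m, 50.0 * height_m  # half sizes in cm
    impacts, hits = [], 0
    for _ in range(n_shots):
        zx, zy = polar_normal_pair(rng)
        x, y = mu_x + sd_x * zx, mu_y + sd_y * zy
        impacts.append((x, y))
        if abs(x) <= half_w and abs(y) <= half_h:
            hits += 1
    return impacts, hits


impacts, hits = simulate_shots(200, 300.0, width_m=1.0, height_m=2.0)
print(hits, "hits out of", len(impacts))
```

The list of impact points is what a plotting routine would draw on the scaled target; the fraction of hits should approach the PHCALC probability for the same inputs as the number of shots grows.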
D. FORCE EFFECTIVENESS INDICES MODEL

1. Introduction.

a. Description. This model, from the TATAWS study (Reference 7), programmed and
modified by Mr. John D'Errico, determines the value of each weapon system in a wargame, based
on input from the killer-victim scoreboards. In general, the value of a weapon in a wargame is
based on the values and quantities of opposing forces killed by that weapon.

b. Limitations. Force effectiveness indices are not commonly used and, as such, are not
familiar to decision makers.

c. Applications. Desktop analytic tool for evaluating the effectiveness of a weapon within
a wargame.

d. Setup. This model runs on an IBM compatible PC computer. Data can be entered
into the model in a few minutes, and results are printed in less than one minute.

2.  Guide  to  Operation. 

a.  Equipment  Required. 

(1)  IBM  PC  compatible  computer. 

(2) 3.5" disk drive.

(3)  Dot  matrix  Printer. 

b.  Installation. 

(1)  Turn  on  the  computer  and  get  to  the  DOS  prompt. 

(2)  Insert  the  3.5"  disk  containing  the  FEI2  model  into  your  computer's  disk  drive. 

(3)  Turn  on  your  printer,  and  make  sure  that  it  is  "on  line." 

(4) From the DOS prompt, enter the command: A:FEI2 (or B:FEI2 if you're operating
from a B: disk drive).

c. Operation.

(1) Essentially, you will be asked to enter the names of Red and Blue weapon systems,
and data from a killer-victim scoreboard. Whether you are conducting a trial run or
not, you should save your data via main menu item #3; it will keep you from being frustrated in
case you exit the program unintentionally (power interruption, etc.) and have to enter the data all
over again. Also, if you make a mistake while entering data, continue to enter the remainder of
the data, since you will be able to make any changes when you're done.

(2) The prompts you will see, and sample responses, are shown on the following
facsimile screens.

(3) The first menu, also called the main menu, is as follows:

1 - Enter Data
2 - Change Data
3 - Save Data
4 - Perform Computations & Print Results
5 - Quit

(Enter one of the above numbers)

? 1

Screen #1

(4) Entering the number 1, in Screen #1, leads to the next menu.

1 - Enter Data From Keyboard
2 - Enter Data From Disk
3 - Return to Main Menu

(Enter one of the above numbers)

? 1

Screen #2
(5) Having selected the method for entering data, you will next be asked to enter the
title for this case. The response in this example is "Test Case #1, 22 Dec 92, John D'Errico."


ENTER TITLE OF GAME. Test Case #1, 22 Dec 92, John D'Errico


Screen #3

(6) You will now be asked to enter the number of Blue weapon types, followed by the
name of each Blue weapon type. For example, the killer-victim scoreboard might show three
types of weapon systems: tanks, Bradley Fighting Vehicles, and Improved TOW vehicles.
Therefore, you would enter 3 for the number of Blue types, followed by the name of each type.
When entering the names of the weapon systems, try to use no more than five or six characters,
such as M1A1 or BFV-1 or TANK1; otherwise, the printout becomes too crowded and difficult
to read.


(7)  It  is  suggested  that  you  enter  every  weapon  system  on  the  killer-victim  scoreboard. 
Later  on,  this  program  will  allow  you  to  choose  those  Blue  and  Red  weapons  you  wish  to  have 
included  in  the  force  effectiveness  ratios. 


ENTER NO. OF BLUE WEAPON TYPES. ? 3

ENTER THE NAME OF BLUE WEAPON # 1
? TANK

ENTER THE NAME OF BLUE WEAPON # 2
? BFV

ENTER THE NAME OF BLUE WEAPON # 3
? HMMWV

Screen  #4 


(8) Similarly for the Red weapon types:


ENTER NO. OF RED WEAPON TYPES. ? 3

ENTER THE NAME OF RED WEAPON # 1
? TANK

ENTER THE NAME OF RED WEAPON # 2
? BMP

ENTER THE NAME OF RED WEAPON # 3
? BRDM

Screen  #5 


(9) Now you will be asked to fill in the data from the killer-victim scoreboard. Read
each prompt carefully, and you should have no trouble entering the correct data. If you make a
mistake, keep going with the correct data. You will be able to make corrections later by
selecting the "Change Data" item from the main menu. Keep in mind that tl


ENTER NO. OF RED TANK KILLED BY BLUE TANK
? 3

ENTER NO. OF RED BMP KILLED BY BLUE TANK
? 4

ENTER NO. OF RED BRDM KILLED BY BLUE TANK
? 2

ENTER NO. OF BLUE TANK
? 6

Screen  #6 


(10) Similar displays will request the remainder of the Blue vs Red and Red vs Blue
results, as follows.


(11)  Red  vehicles  killed  by  Blue  BFV: 


(12)  Red  vehicles  killed  by  Blue  HMMWV: 


(13)  Blue  vehicles  killed  by  Red  tank: 
(14) Blue vehicles killed by Red BMP:


ENTER NO. OF BLUE TANK KILLED BY RED BMP
? 3

ENTER NO. OF BLUE BFV KILLED BY RED BMP
? 3

ENTER NO. OF BLUE HMMWV KILLED BY RED BMP
? 1

ENTER NO. OF RED BMP
? 12


Screen #10

(15) Blue vehicles killed by Red BRDM:


ENTER NO. OF BLUE TANK KILLED BY RED BRDM
? 1

ENTER NO. OF BLUE BFV KILLED BY RED BRDM
? 3

ENTER NO. OF BLUE HMMWV KILLED BY RED BRDM
? 0

ENTER NO. OF RED BRDM
? 4


Screen #11

(16)  The  program  now  returns  to  the  main  menu.  (SAVE  YOUR  DATA!) 


(17) Save the data immediately after data entry. You can always change it and then
save it again later. Saving it as soon as possible will preclude having to re-enter all the data in
case of a mishap. When you select item #3, above, to save the data, you will be prompted for a
drive and filename. You may save the data to any drive and any normal filename (beginning with
an alphabetic letter and having no more than eight letters and numerical digits, with no spaces in
it). Remember, if you want the data saved in a particular directory on the C: drive, you must
specify the entire path in the filename. For example, if you want to save your data on your fixed
disk, in an ORSA\BFV-COEA directory, with a file name of BFV-RUN3.DAT, then your file
name would be C:\ORSA\BFV-COEA\BFV-RUN3.DAT. For saving your data to a floppy disk,
usually something like A:BFV-RUN3.DAT is sufficient, since most people don't create different
directories on their floppy disks. You may want to use a different floppy for data.


Screen  #13 


(18) The data will now be saved, and the main menu will reappear.


1 - Enter Data
2 - Change Data
3 - Save Data
4 - Perform Computations & Print Results
5 - Quit

(Enter  one  of  the  above  numbers) 
(19) Item #2, to change data, was selected in this example simply to display the change
data menu which appears below.


Screen  #16 


(21) Selecting item #4, above, does not immediately initiate the calculations and
printing of results. The user is first given an opportunity to select whether or not standard force
effectiveness ratios should be included in the results (force exchange ratio, loss exchange ratio,
system exchange ratio, percent system contribution, and percent force remaining), in addition to
this model's force effectiveness indicators. See paragraph d, Definitions.


(22)  Select  type  of  results  desired. 

1 - Print Standard Effectiveness Ratios
2 - Do Not Print Standard Effectiveness Ratios
(Enter 1 or 2)

? 1

Screen  #17 

(23) Before doing the calculations and printing the results, the user is given the chance
to select the types of forces to be included in the calculations and force effectiveness ratios.
Consequently, the list of Blue forces will be displayed, followed by the list of Red forces, and the
user selects the forces to be counted in the resulting ratios.

(24)  Select  forces  to  be  included  in  computation  of  results. 

1 TANK
2 BFV
3 HMMWV

Enter the numbers (one at a time, pressing the enter key after every selection) you
want included in the standard force ratios.
Enter -9 when all selections have been made.
Enter -1 to select all the items.


Screen  #18 


(25)  After  making  the  Blue  weapon  system  selections,  the  list  of  Red  weapon  systems 
will  be  displayed,  as  in  Screen  #19. 


1 TANK
2 BMP
3 BRDM

Enter the numbers (one at a time) you want
included in the standard force ratios.
Enter -9 when all selections have been made.
Enter -1 to select all the items.


Screen  #19 


(26) Having made the Blue force and Red force selections, the program will
automatically perform the calculations and print the results. After the results have been printed,
the main menu will be displayed. Selecting "5" to quit the program will also send the necessary
control codes to your printer to return it to normal (after printing the results in small print).

(27)  The  following  printout  is  a  result  of  the  inputs  used  in  the  above  example. 
DESCRIPTION: Test Case #1, 22 Dec 92, John D'Errico
FILENAME: A:FE1-TEST.DAT

KILLER / VICTIM MATRICES

                         RED VICTIM
BLUE KILLER     NUM     TANK     BMP     BRDM     TOTAL
TANK            4.0      3.0     4.0      2.0       9.0
BFV            12.0      3.0     3.0      3.0       9.0
HMMWV           2.0      0.0     2.0      0.0       2.0

                         BLUE VICTIM
RED KILLER      NUM     TANK     BFV     HMMWV    TOTAL
TANK           13.0      3.0     4.0      0.0       4.0
BMP            13.0      3.0     3.0      1.0       7.0
BRDM            4.0      1.0     3.0      0.0       4.0

TOTAL VALUE OF BLUE = 0.7314
TOTAL VALUE OF RED = 0.1773
FORCE EFFECTIVENESS RATIO (FER) =
(Total Blue Value)/(Total Red Value)


STANDARD LER =
STD BLUE PFR =
0.1333 STD RED PFR =

STANDARD FER =


1.1743 (Reds Killed/Blues Killed)

0.1300

0.3S37

0J40S


FEI VALUES                         STD. SER VALUES                 FRACTIONAL PARTICIPATION INDICES
(Value of one weapon)              WEAPON SYSTEM   SER      PSC      LETHALITY   SURVIVABILITY

TANK   0.0347    TANK  0.0340      TANK           1.3000   0.4100    1.3000      0.0000
BFV    0.0134    BMP   0.0331      BFV            1.3537   0.4340    0.7300      1.1111
HMMWV  0.0174    BRDM  0.0443      HMMWV          0.3000   0.1000    1.0000      1.1113

d. Definitions.


Standard Loss Exchange Ratio (LER): (Red Losses)/(Blue Losses).

Standard Force Exchange Ratio (FER): [(Red Losses)/(Red Initial Strength)] / [(Blue Losses)/(Blue Initial Strength)].

Standard Specific Exchange Ratio (SER): (Red Losses From Specific Blue System)/(Specific Blue System Losses).

Standard Percent System Contribution: (Red Losses Due to Specific Blue System)/(Total Red Losses).

Percent Force Remaining: (Total Number of Blue Survivors)/(Total Initial Number of Blue Forces).

Lethality: (Red Losses by Specific Blue System)/(Initial Number of Specific Blue Systems).

Survivability: [(Blue System Survivors)/(Total Blue Survivors)] / [(Initial Blue Systems)/(Total Initial Blue Force)].

Total Value of Blue: Sum of the value of each Blue weapon times the initial number of that Blue weapon.

Total Value of Red: Sum of the value of each Red weapon times the initial number of that Red weapon.
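The standard ratios defined above can be computed directly from a killer-victim scoreboard. The following sketch is illustrative only: the function name, the dictionary layout, and the sample numbers are hypothetical (they are not the Test Case #1 data), and the model's own force effectiveness index calculation is not reproduced here.

```python
def standard_ratios(blue_init, red_init, blue_kills, red_kills):
    """Standard ratios from killer-victim data, seen from the Blue side.

    blue_kills[b][r] = number of Red system r killed by Blue system b;
    red_kills[r][b]  = number of Blue system b killed by Red system r.
    """
    blue_losses = {b: sum(red_kills[r].get(b, 0) for r in red_kills)
                   for b in blue_init}
    total_red_losses = sum(sum(row.values()) for row in blue_kills.values())
    total_blue_losses = sum(blue_losses.values())
    total_blue = sum(blue_init.values())
    total_red = sum(red_init.values())
    blue_survivors = total_blue - total_blue_losses

    result = {
        "LER": total_red_losses / total_blue_losses,
        "FER": (total_red_losses / total_red) / (total_blue_losses / total_blue),
        "PFR": blue_survivors / total_blue,
        "systems": {},
    }
    for b, n_init in blue_init.items():
        kills = sum(blue_kills[b].values())
        survivors = n_init - blue_losses[b]
        result["systems"][b] = {
            # SER is undefined when the system suffers no losses.
            "SER": kills / blue_losses[b] if blue_losses[b] else float("inf"),
            "PSC": kills / total_red_losses,
            "lethality": kills / n_init,
            "survivability": (survivors / blue_survivors) / (n_init / total_blue),
        }
    return result


# Hypothetical scoreboard for illustration.
blue_init = {"TANK": 10, "BFV": 20}
red_init = {"TANK": 20, "BMP": 20}
blue_kills = {"TANK": {"TANK": 4, "BMP": 2}, "BFV": {"TANK": 1, "BMP": 3}}
red_kills = {"TANK": {"TANK": 2, "BFV": 3}, "BMP": {"TANK": 1, "BFV": 2}}
ratios = standard_ratios(blue_init, red_init, blue_kills, red_kills)
print(ratios["LER"], ratios["FER"], ratios["PFR"])
```

With this hypothetical scoreboard Red loses 10 systems and Blue loses 8, so the loss exchange ratio is 1.25.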
E. SINGLE SHOT BURSTING MUNITIONS MODEL

1. Introduction.

a. Description. This computer model was developed by Mr. John D'Errico, Dismounted
Warfighting Battle Laboratory, U.S. Army Infantry School, Fort Benning, Georgia. It displays
the results of firing one or more bursting (exploding) munitions from a single-shot weapon, such
as the M203 grenade launcher, at an area target. Personnel in the target area may be deployed in
a line, file, column, or wedge formation. Inputs required are: the biases and dispersions of the
weapon; the projectile velocity; the weapon-target range; radius of damage; number of single
rounds to be fired at the target; and the number, spacing, and formation of personnel in the target
area.


b. Limitations. The targets depicted in this model are stationary, standing, two
dimensional, personnel targets.

c. Applications. Desktop analyses involving small arms, small arms munitions, and their
effects on personnel area targets.

d. Setup. This model runs on any IBM compatible PC computer. Run time depends on
the number of iterations desired, with one to fifteen minutes being typical. Each iteration takes
about one second.

2.  Guide  to  Operation. 

a.  Equipment  Required. 

(1)  IBM  compatible  PC  computer. 

(2) 3.5" disk drive.

(3)  Printer  (optional). 

b.  Installation. 

(1)  Turn  on  the  computer  and  get  to  the  DOS  prompt. 

(2)  Insert  the  3.5"  disk  containing  the  SSBURST  model  into  your  disk  drive. 

(3)  See  paragraph  C.2.b.(3),  Probability  of  Hit  Plotting  Model,  for  printing  graphics. 

(4) From the DOS prompt, enter the command: A:SSBURST (or B:SSBURST if
you're using the B: disk drive).
c. Definitions. For this model, one "iteration" refers to firing one set of rounds against the
target. For example, if the number of rounds to be fired at the target is four, then each iteration
will fire four rounds at the target. For trial purposes, five or ten iterations are sufficient to see the
model work. For more accurate results, 200 to 1000 iterations are recommended.
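The idea of an iteration can be illustrated with a simplified Monte Carlo sketch. It is not the SSBURST source and makes several assumptions that should be read as such: the target area is treated as a flat plane with the aimpoint at the center of a line formation, projectile velocity and fixed biases are ignored, the mils-to-distance conversion is borrowed from the PHCALC explanation, and a target counts as killed when it lies within the radius of damage of any burst. All function names are hypothetical.

```python
import math
import random

MIL_TO_RAD = 0.00098175  # mils-to-radians constant quoted for the PHCALC model


def mils_to_meters(mils, range_m):
    """Convert an angular error in mils to a miss distance in meters."""
    return 2.0 * range_m * math.tan(0.5 * mils * MIL_TO_RAD)


def line_formation(n_personnel, spacing_m):
    """Place personnel on a line centered on the aimpoint (the origin)."""
    x0 = -0.5 * (n_personnel - 1) * spacing_m
    return [(x0 + i * spacing_m, 0.0) for i in range(n_personnel)]


def fire_iteration(targets, n_rounds, sigma_x, sigma_y, radius_m, rng):
    """Fire one set of rounds; return the number of distinct targets killed."""
    killed = set()
    for _ in range(n_rounds):
        burst_x = rng.gauss(0.0, sigma_x)
        burst_y = rng.gauss(0.0, sigma_y)
        for idx, (tx, ty) in enumerate(targets):
            if math.hypot(tx - burst_x, ty - burst_y) <= radius_m:
                killed.add(idx)
    return len(killed)


def average_kills(n_iterations, n_rounds, disp_h_mils, disp_v_mils, range_m,
                  radius_m, n_personnel, spacing_m, seed=1):
    """Average number of targets killed per iteration."""
    rng = random.Random(seed)
    targets = line_formation(n_personnel, spacing_m)
    sigma_x = mils_to_meters(disp_h_mils, range_m)
    sigma_y = mils_to_meters(disp_v_mils, range_m)
    total = sum(fire_iteration(targets, n_rounds, sigma_x, sigma_y,
                               radius_m, rng)
                for _ in range(n_iterations))
    return total / n_iterations


# Four rounds per iteration, 1000 iterations, nine personnel 5 m apart.
print(average_kills(1000, 4, 5.0, 5.0, 200.0, 5.0, 9, 5.0))
```

Averaging over more iterations reduces the run-to-run scatter of the result, which is why a larger number of iterations is recommended for accurate results.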

d. Operation.

(1) You will be prompted for input. The first prompt will ask you to enter the
horizontal fixed bias of the weapon system. Entering a zero indicates that the weapon has been
zeroed for the range to the target. Biases and dispersions are in mils.

(2) The next prompt will ask you to enter the vertical fixed bias of the weapon system.
Entering a zero indicates that the weapon has been zeroed for the range to the target.

(3) A total of ten prompts will appear on the screen, and you must enter a response for
each one. Biases and dispersions are in mils, and distances are in meters. All ten prompts and
sample responses are shown below.

(4) Keep in mind that the wedge formation was constructed for nine personnel only. If
you plan to select a wedge, enter a 9 in the eighth prompt below. The other formations can
accept any number of personnel.


(7)  Select  formation. 


Target  formation: 

1 - Line

2  -  Column 

3  -  Wedge  (9-man  squad) 

4  -  File 

(Enter  1  -  4)?  2 


(8)  After  you  input  the  number  of  iterations,  the  model  will  begin  to  graphically  display 
each  iteration's  results,  one  iteration  at  a  time. 


(9)  Each  iteration's  results  are  displayed  on  the  screen,  for  a  brief  time.  When  the  last 
iteration  hu  been  completed,  the  picture  will  remain  on  the  screen  until  you  either  print  the 
screen  to  a  printer,  or  press  any  other  key  to  return  to  the  MS-DOS  prompt. 


(10) You will automatically get a printout of results, showing how each round did
against each target. The printout will include the number of the round, the number of the target,
the effect of each round on the target area, the average number of targets killed by each round,
and the average results for the cumulative effect of all rounds. For 1000 iterations, the results
are highly repeatable.


(11) Although you may select practically any number of personnel for a line, column,
or file formation, the wedge currently applies to only nine personnel. Keep in mind that the scale
of the display on the screen depends on the number of personnel in the target area and their
separation distance. Choosing a large number of personnel separated by 10 meters will make the
personnel, and possibly the bursting radius, very small or invisible.


(12) The circle which represents the bursting radius on the screen may appear to
enclose a target without killing it (killed targets are shown as solid white squares). This is
because the screen's vertical-to-horizontal scale may not allow a circle to look like a circle.
Sometimes the bursting radius circle will appear as an oval, or ellipse. The mathematics,
however, are correct, and all targets within the bursting radius are killed.
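
The kill rule in (12) amounts to a distance test against the bursting radius, independent of how the circle is drawn on the screen. The following Python sketch illustrates that test only. It is not taken from the SSBURST program; the function name `targets_killed`, the coordinate layout, and the sample burst point are assumptions made for illustration.

```python
import math

def targets_killed(burst_xy, targets_xy, radius):
    """Return the indices of targets lying within `radius` meters of the burst point."""
    bx, by = burst_xy
    return [
        i
        for i, (tx, ty) in enumerate(targets_xy)
        if math.hypot(tx - bx, ty - by) <= radius
    ]

# Hypothetical line of five personnel spaced 5 meters apart along the x-axis.
line = [(5.0 * i, 0.0) for i in range(5)]

# Hypothetical burst at (7, 1) with a 5-meter radius of damage.
killed = targets_killed((7.0, 1.0), line, 5.0)
print(killed)
```

Because the test uses true distances in meters, the outcome does not depend on the aspect ratio of the display.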

(13) A complete example, from prompts and responses to results, follows.

(14) Opening screen:


This  program  was  authored  by: 
John  D'Errico 

Dismounted  Warfighting  Battle  Lab 
U.S.  Army  Infantry  School 
Fort  Benning,  GA  31905 
(706)  545-7611/7000
DSN  835-7611/7000 


(Press  the  [Enter]  key  to  continue)? 




Screen  #3 




(15)  Description  of  inputs: 


This  program  will  require  you  to  enter  the  following: 
Horizontal  and  vertical  fixed  biases  (zeroes  if  the 
weapon  is  assumed  to  be  zeroed  on  the  target). 

Total  horizontal  variable  biases  and  dispersions. 

Total  vertical  variable  biases  and  dispersions. 

Muzzle  or  average  projectile  velocity. 

Range  to  the  target  area. 

Bursting  munition's  radius  of  damage. 

Number  of  personnel  in  the  target  area. 

Separation  distance  between  personnel. 

The  number  of  single  rounds  to  be  fired. 

The  target  formation:  line,  column,  wedge  or  file 
(not  applicable  to  a  single  person  point  target). 
Number  of  iterations  (not  applicable  to  point  targets.) 


The  last  picture  plotted  on  the  screen  remains  until  you 
press  a  key,  in  case  you  want  to  first  print  it  with  [PrtSc]. 


(PRESS  THE  ENTER  KEY  TO  BEGIN  THE  PROGRAM/INPUTS)? 


Screen  #4

(16)  Select  printer  option:


Screen  #5 


(17)  Inputs: 


Enter  the  horizontal  fixed  bias  of  the  weapon  system?  0 
Enter  the  vertical  fixed  bias  of  the  weapon  system?  0 
Enter  the  total  horizontal  variable  biases  and  dispersions?  10 
Enter  the  total  vertical  variable  biases  and  dispersions?  10 
Enter  the  projectile  velocity  in  meters  per  sec?  60 
Enter  the  weapon-target  range  in  meters?  250 
Enter  the  radius  of  damage  in  meters?  5 
Enter  the  number  of  personnel  in  the  target  area?  11
Enter  the  space  between  personnel?  5 
Enter  the  number  of  single  rds  to  be  fired?  4 


Screen  #6 


(18)  Target  formation: 


SAMPLE PRINTOUT OF THE SSBURST PROGRAM

[Screen plot of the target area; axis labels run from -70 to 70. Values listed on the screen:]

H.Bias: 0
V.Bias: 0
H.Disp: 10
V.Disp: 10
Veloc: 60
Range: 250
Damage: 5
Tgts: 11
Rounds: 4
KILLS: 6

Targets killed: 6
Kill ratio: .5454546

(20) The following results are based on the example above. Since only 10 iterations
were used, you can expect substantially different results if you run the same example.

NUMBER OF ITERATIONS: 10

ROUND     TGT NUM    KILLED    AVG/ITERATION
  1          1          4          0.400
  1          2          4          0.400
  1          3          4          0.400
  1          4          3          0.300
  1          5          2          0.200
  1          6          1          0.100
  1          7          0          0.000
  1          8          1          0.100
  1          9          0          0.000
  1         10          0          0.000
  1         11          0          0.000
  1        ALL         19          1.900
  2          1          1          0.100
  2          2          2          0.200
  2          3          1          0.100
  2          4          2          0.200
  2          5          4          0.400
  2          6          4          0.400
  2          7          3          0.300
  2          8          1          0.100
  2          9          0          0.000
  2         10          1          0.100
  2         11          0          0.000
  2        ALL         19          1.900
  3          1          1          0.100
  3          2          0          0.000
  3          3          1          0.100
  3          4          2          0.200
  3          5          1          0.100
  3          6          5          0.500
  3          7          1          0.100
  3          8          4          0.400
  3          9          1          0.100
  3         10          1          0.100
  3         11          1          0.100
  3        ALL         18          1.800
  4          1          0          0.000
  4          2          0          0.000
  4          3          0          0.000
  4          4          0          0.000
  4          5          1          0.100
  4          6          0          0.000
  4          7          3          0.300
  4          8          2          0.200
  4          9          3          0.300
  4         10          3          0.300
  4         11          2          0.200
  4        ALL         14          1.400
CUMULATIVE             70          7.000

(21) Positioning of personnel in the various formations is according to the following
format. The numbers indicate the actual number assigned each person in the target area, and
match the numbers referred to in the printout of results.


Line 


6  5  4  3  2  1 


Column 


11 

9 

7 

5 

3 

1 


10 

8 

6 

4 

2 


Wedge  File

6 

7  5 

4 

9  6  3 

2 


2  4 


1

F.  ANALYTIC  HIERARCHY  PROCESS  MODEL


1.  Introduction.

a. Description. This program was authored by Mr. John D'Errico, U.S. Army Infantry
School. The Analytic Hierarchy Process (AHP) was developed by Thomas L. Saaty in the early
1970's. It is a method for ranking a set of alternatives based on multiple levels of characteristics.
For example, performance and cost may be two characteristics on one level, and they
might each consist of several other characteristics on a lower level. In turn, each of these
characteristics could be further defined by characteristics on even lower levels. Each
characteristic's value may be based on physical data such as seconds, inches, pounds, dollars, probability of
hit, etc., or on subjective evaluations. The AHP can also assist the user when developing
subjective values.

b. Limitations. This model is primarily intended for first-time users of Saaty's Analytic
Hierarchy Process. It is considered more as a tutorial which will enable the user to make an easy
transition to the use of a spreadsheet program such as Lotus 1-2-3. Spreadsheet software would
be much faster and more flexible for a complex AHP analysis.

c. Applications. The AHP has been applied to a large variety of problems in the areas of
education, management of energy, political candidacy, transportation planning, and others. It has
also been in use at the Pentagon. At the Infantry School the AHP was used in the combat boot
analysis, multipurpose bayonet analysis, and TOW warhead improvement analysis and selection.

d.  Setup.  Mr.  John  D’Errico  has  developed  two  BASIC  language  programs  for  the 
Analytic  Hierarchy  Process.  These  programs  will  run  on  any  IBM  compatible  PC.  Data  sorting 
and  transformations  usually  take  one  or  two  days.  Runs  can  occur  at  the  rate  of  one  every  ten 
minutes.  Lotus  1-2-3  can  also  be  used  to  run  the  AHP,  in  which  case  the  user  gains  much 
flexibility  and  speed  in  sensitivity  analyses  and  run  time. 

2.  Guide  to  Operation. 

a.  Equipment  Required. 

(1)  IBM  PC  compatible  computer. 

(2)  A  3.5"  disk  drive. 

(3)  A  printer. 

(4) GWBASIC. This programming language can usually be found on the DOS disks if
you have a DOS version earlier than 5.0. It is also provided on the modeling disk. If using DOS
version 5.0 or more recent, use the GWBASIC on the modeling disk (A:GWBASIC).

b. Installation.


(1)  Turn  on  the  computer  and  get  to  the  DOS  prompt. 

(2)  If  you  are  using  the  a:  drive,  enter  the  command  A:GWBASIC 

(3) You will know that GWBASIC has been loaded when you see a screen with the
OK prompt at the top and the ten function keys along the bottom.

(4) Enter the command LOAD"A:AHP" (You will receive another "OK" prompt).

(5)  Enter  the  command  RUN 

(6)  You  will  now  see  the  prompting  messages,  and  requests  for  data,  according  to  the 
facsimile  screens  shown  at  the  end  of  this  section. 


c. Example. The following example shows the mechanics of the AHP process, and it
should help to explain both the process itself and the terminology associated with it. It will also
serve as a basis for describing some of the practical applications in which the AHP has been used,
and the various ways of setting up the AHP to fit the problem at hand. This example assumes
that there are three alternatives (ALT1, ALT2, ALT3) and five characteristics (CHAR1, CHAR2,
CHAR3, CHAR4, CHAR5) which will be used to evaluate the alternatives.


(1) STEP 1. Compare each characteristic to every other characteristic. Comparative
values or weights may be based on either real data such as seconds, pounds, feet, or dollars, or
on subjective determinations. A matrix for these pairwise comparisons of characteristics
would be set up as follows.

         CHAR1   CHAR2   CHAR3   CHAR4   CHAR5
CHAR1    1.00
CHAR2            1.00
CHAR3                    1.00
CHAR4                            1.00
CHAR5                                    1.00


(a) The 1's on the main diagonal indicate that each characteristic is equal to itself in
importance. To fill in the remainder of the matrix, ask yourself how much more important or
better the item in the left column is than the item across the top row. For the use of subjective
data, Saaty recommends a scale of one to nine, where the number 1 indicates equality, and three,
five, seven, and nine indicate that the item on the left is weakly more important, strongly more
important, demonstrably more important, and absolutely more important than the item across the
top. In this example, we assume that we have physical measurements which we are comparing.
Accordingly, we know that CHAR1 is five times better than CHAR2, three times better than
CHAR3, three times better than CHAR4, and nine times better than CHAR5. Adding these
comparative values to the matrix results in the following.

         CHAR1   CHAR2   CHAR3   CHAR4   CHAR5
CHAR1    1.00    5.00    3.00    3.00    9.00
CHAR2            1.00
CHAR3                    1.00
CHAR4                            1.00
CHAR5                                    1.00

(b) After each row is filled, the reciprocal of each number in the row is entered in
the symmetrically opposite cell across the main diagonal. For example, since the intersection of
CHAR1 and CHAR3 is a 3, meaning CHAR1 is three times better than CHAR3, then the
intersection of CHAR3 and CHAR1 is 1/3, or 0.33, meaning CHAR3 is one-third as good as
CHAR1, as follows.


         CHAR1   CHAR2   CHAR3   CHAR4   CHAR5
CHAR1    1.00    5.00    3.00    3.00    9.00
CHAR2    0.20    1.00
CHAR3    0.33            1.00
CHAR4    0.33                    1.00
CHAR5    0.11                            1.00

(c) Since we're not using subjective evaluations, we can actually fill in all cells based
on the relationships established in the first row. Since CHAR1 is five times better than CHAR2
and three times better than CHAR3, then CHAR2 must be 3/5 as good as CHAR3. Similarly,
since CHAR1 is five times better than CHAR2 and nine times better than CHAR5, then CHAR2
must be 9/5 (1.80) times better than CHAR5, and so on. Consequently, the matrix will be filled
as follows, based on the relationships established in the first row.


         CHAR1   CHAR2   CHAR3   CHAR4   CHAR5
CHAR1    1.00    5.00    3.00    3.00    9.00
CHAR2    0.20    1.00    0.60    0.60    1.80
CHAR3    0.33    1.67    1.00    1.00    3.00
CHAR4    0.33    1.67    1.00    1.00    3.00
CHAR5    0.11    0.55    0.33    0.33    1.00
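
Filling every cell from the first row can be sketched in a few lines of Python. This is an illustrative sketch, not the author's BASIC program; the function name `matrix_from_first_row` is an assumption. Each cell is the ratio of two first-row entries, so the cell in row i, column j equals first_row[j] / first_row[i].

```python
def matrix_from_first_row(first_row):
    """Build a fully consistent pairwise matrix: a[i][j] = first_row[j] / first_row[i]."""
    n = len(first_row)
    return [[first_row[j] / first_row[i] for j in range(n)] for i in range(n)]

# CHAR1 compared with CHAR1 through CHAR5, as given in the example.
first_row = [1.0, 5.0, 3.0, 3.0, 9.0]
matrix = matrix_from_first_row(first_row)

for row in matrix:
    print("  ".join(f"{value:5.2f}" for value in row))
```

Printed values are rounded to two decimals, so an entry such as 5/9 may differ in the last digit from the table in the text.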

(2)  STEP  2.  Compute  the  priority  vector.  Mathematically,  this  is  roughly  equivalent 
to  normalizing  the  principal  eigenvector. 

(a) For each row, take the nth root of the product of the n numbers in the row, as
follows. This is all done automatically in the model, but to translate this to Lotus 1-2-3 you must
know the process occurring within the model.

         CHAR1   CHAR2   CHAR3   CHAR4   CHAR5    PRODUCT   5TH ROOT
CHAR1    1.00    5.00    3.00    3.00    9.00     405.00      3.32
CHAR2    0.20    1.00    0.60    0.60    1.80       0.13      0.66
CHAR3    0.33    1.67    1.00    1.00    3.00       1.65      1.11
CHAR4    0.33    1.67    1.00    1.00    3.00       1.65      1.11
CHAR5    0.11    0.55    0.33    0.33    1.00       0.007     0.37

(b) Normalize this last vector by dividing each number by the sum of all the
numbers. In this case, the sum of the numbers is 3.32 + 0.66 + 1.11 + 1.11 + 0.37 = 6.57, so the
normalized numbers would be as follows.

CHAR1    3.32/6.57    0.51
CHAR2    0.66/6.57    0.10
CHAR3    1.11/6.57    0.17
CHAR4    1.11/6.57    0.17
CHAR5    0.37/6.57    0.06
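
Steps 2(a) and 2(b) can be sketched together in Python: take the nth root of each row product, then divide by the sum. This is a minimal sketch under the assumption that the matrix is built from the example's first row; the function name `priority_vector` is hypothetical and is not taken from the AHP program.

```python
import math

def priority_vector(matrix):
    """Geometric mean of each row, normalized so that the entries sum to 1."""
    n = len(matrix)
    roots = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(roots)
    return [root / total for root in roots]

# Comparison matrix from the example, rebuilt from the first row of ratios.
first_row = [1.0, 5.0, 3.0, 3.0, 9.0]
matrix = [[first_row[j] / first_row[i] for j in range(5)] for i in range(5)]

weights = priority_vector(matrix)
print([round(w, 2) for w in weights])
```

In a spreadsheet, the same computation is a row product, a fractional power, and a division by the column sum.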

(c) This priority vector is really a statement of the weights attributed to each of the
characteristics according to the pairwise values given in the above matrices. In other words,
CHAR1 is considered to be the most important characteristic, with a score of 0.51, and it is three
times as important as either CHAR3 or CHAR4, which each have a value of 0.17. Except for the
mathematical rounding errors, the characteristics have maintained their original relationship. But
this is because we have not used subjective values. Had we used subjective data, we would not
have taken the first row of data in the initial matrix and automatically formed reciprocals. Instead
we would have continued to enter raw subjective entries for each cell, without regard to
previously implied relationships. When using purely subjective means to acquire the entries, we
could very well end up saying that CHAR1 is five times better than CHAR2 and three times
better than CHAR3 (which implies that CHAR2 is 3/5 as good as CHAR3) and then say that
CHAR2 is half as good as CHAR3.

(3) STEP 3. Estimate the consistency of the priority vector. This will be our measure
or indication of how consistently the characteristics were compared to each other during
development of the original matrix of pairwise comparisons. Again, since we have not used
subjective data, our matrix of pairwise comparisons should be consistent. An example of
inconsistency was given at the end of the paragraph above.

(a)  Multiply  the  matrix  of  comparisons  by  the  priority  vector. 


       COMPARISON MATRIX              PRIORITY VECTOR       V1

1.00   5.00   3.00   3.00   9.00           0.51            2.57
0.20   1.00   0.60   0.60   1.80           0.10            0.51
0.33   1.67   1.00   1.00   3.00     x     0.17      =     0.86
0.33   1.67   1.00   1.00   3.00           0.17            0.86
0.11   0.55   0.33   0.33   1.00           0.06            0.28

(b) Obtain a new vector V2 by dividing the first element of V1 by the first element
of the priority vector, the second element of V1 by the second element of the priority vector, and
so on, as follows.

V2

2.57/.51    5.04
 .51/.10    5.10
 .86/.17    5.06
 .86/.17    5.06
 .28/.06    4.67


(c) Add the elements in V2 and divide this sum by the number of elements (i.e.,
average the numbers in V2). In our example, (5.04 + 5.10 + 5.06 + 5.06 + 4.67)/5 = 4.99. This
number, 4.99, is an approximation of the maximum (or principal) eigenvalue, abbreviated as
λmax, and it is used to estimate the consistency of the pairwise comparisons. The closer λmax is
to the number of rows or columns in the matrix of comparisons, the more consistent the pairwise
comparisons were.
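
The eigenvalue estimate of Step 3(a) through 3(c) can be sketched as follows. The matrix and priority vector are the two-decimal values printed in the text; the function name `estimate_lambda_max` is an assumption made for illustration and is not the author's code.

```python
def estimate_lambda_max(matrix, weights):
    """Approximate the principal eigenvalue: average of (matrix x weights) / weights."""
    n = len(matrix)
    v1 = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    v2 = [v1[i] / weights[i] for i in range(n)]
    return sum(v2) / n

# Two-decimal values as printed in the example.
matrix = [
    [1.00, 5.00, 3.00, 3.00, 9.00],
    [0.20, 1.00, 0.60, 0.60, 1.80],
    [0.33, 1.67, 1.00, 1.00, 3.00],
    [0.33, 1.67, 1.00, 1.00, 3.00],
    [0.11, 0.55, 0.33, 0.33, 1.00],
]
weights = [0.51, 0.10, 0.17, 0.17, 0.06]

lambda_max = estimate_lambda_max(matrix, weights)
print(f"{lambda_max:.2f}")
```

With unrounded inputs from a perfectly consistent matrix the estimate would equal n exactly; the small shortfall here comes from working with two-decimal values.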

(d) How close is close? A method of evaluating the consistency follows.

- Obtain the consistency index by dividing (λmax - n) by (n - 1). In our example,
the consistency index would be (4.99 - 5)/(5 - 1) = -.01/4 = -.003. Since we're only interested in the
magnitude of the difference, and not its direction, we'll call it .003.


- Divide the consistency index by the appropriate random index, shown in the table
below, to obtain the consistency ratio. A consistency ratio of 0.10 or less is considered
acceptable. In our case, the consistency ratio would be .003/1.12 = .003, indicating that we
were consistent in our pairwise comparisons. If we had been using subjective judgements for all
our comparisons, the consistency ratio would help us catch significant errors in transitivity, such
as: A is as good as B, B is twice as good as C, and A is as good as C.

- Random indices for comparison matrices of up to 15 rows (or 15 columns).

Number of Rows    Random Index
       3               .58
       4               .90
       5              1.12
       6              1.24
       7              1.32
       8              1.41
       9              1.45
      10              1.49
      11              1.51
      12              1.48
      13              1.56
      14              1.57
      15              1.59
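
The consistency check in (d) can be sketched as a short Python function. This is an illustrative sketch, not the author's program: the function name `consistency_ratio` is hypothetical, and the `RANDOM_INDEX` entries are the commonly published Saaty random indices for matrices of 3 to 10 rows, entered here as an assumption.

```python
# Commonly published random indices for Saaty's method (assumed values, 3 to 10 rows).
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}

def consistency_ratio(lambda_max, n):
    """Consistency index (lambda_max - n)/(n - 1), divided by the random index for n rows."""
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]

# Values from the worked example: lambda_max of 4.99 for a 5-row matrix.
cr = consistency_ratio(4.99, 5)
acceptable = abs(cr) <= 0.10
print(f"C.R. = {cr:.5f}, acceptable: {acceptable}")
```

Only the magnitude of the ratio matters for the 0.10 threshold, which is why the sketch compares the absolute value.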
(4) STEP 4. Much of the above work was done to obtain a measure of consistency for
the pairwise comparisons made in the original matrix. The priority vector, however, is what we
were after. Now we have to repeat the process for the matrix of alternatives as shown below.

(a) The set of alternatives must now be evaluated in light of the above
characteristics. In order to do so, pairwise comparisons must be made with respect to each
characteristic above. This means we will have five sets of matrices, one for each characteristic.

In the first matrix of pairwise comparisons, the question we are asking is: with respect to
CHAR1, how much better or more important is ALT1 than ALT2, and so on.

(b)  These  matrices,  along  with  their  priority  vectors,  maximum  eigenvalues, 
consistency  indices  (C.I.),  and  consistency  ratios  (C.R.),  are  shown  below. 


CHAR1    ALT1    ALT2    ALT3      PRIORITY VECTOR    λMAX    C.R.
ALT1     1.00    0.50    2.00           0.29          3.00    0.00
ALT2     2.00    1.00    4.00           0.57
ALT3     0.50    0.25    1.00           0.14

CHAR2    ALT1    ALT2    ALT3      PRIORITY VECTOR    λMAX    C.R.
ALT1     1.00    1.00    0.50           0.25          3.00    0.00
ALT2     1.00    1.00    0.50           0.25
ALT3     2.00    2.00    1.00           0.50

CHAR3    ALT1    ALT2    ALT3      PRIORITY VECTOR    λMAX    C.R.
ALT1     1.00    2.00    2.00           0.50          3.00    0.00
ALT2     0.50    1.00    1.00           0.25
ALT3     0.50    1.00    1.00           0.25

CHAR4    ALT1    ALT2    ALT3      PRIORITY VECTOR    λMAX    C.R.
ALT1     1.00    1.00    2.00           0.40          3.00    0.00
ALT2     1.00    1.00    2.00           0.40
ALT3     0.50    0.50    1.00           0.20

CHAR5    ALT1    ALT2    ALT3      PRIORITY VECTOR    λMAX    C.R.
ALT1     1.00    1.00    1.00           0.33          3.00    0.00
ALT2     1.00    1.00    1.00           0.33
ALT3     1.00    1.00    1.00           0.33



(5) STEP 5. The matrix of priority vectors from the pairwise comparisons of the
alternatives is now multiplied on the right by the priority vector from the characteristics.


0.29   0.25   0.50   0.40   0.33           0.51            0.340  (ALT1)
0.57   0.25   0.25   0.40   0.33           0.10            0.443  (ALT2)
0.14   0.50   0.25   0.20   0.33     x     0.17      =     0.217  (ALT3)
                                           0.17
                                           0.06

This  last  vector,  the  solution  vector,  shows  the  final  values  of  alternative  1  through  alternative  3. 
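
The multiplication in Step 5 can be sketched in Python using the two-decimal priority vectors printed in the text. This is an illustrative sketch; the variable names are assumptions, and because the inputs are rounded, the scores agree with the solution vector only to about two decimal places.

```python
# Rows: ALT1..ALT3. Columns: priorities with respect to CHAR1..CHAR5 (rounded values).
alt_priorities = [
    [0.29, 0.25, 0.50, 0.40, 0.33],
    [0.57, 0.25, 0.25, 0.40, 0.33],
    [0.14, 0.50, 0.25, 0.20, 0.33],
]
# Priority vector of the characteristics (rounded values).
char_weights = [0.51, 0.10, 0.17, 0.17, 0.06]

# Matrix times vector: one weighted sum per alternative.
scores = [sum(p * w for p, w in zip(row, char_weights)) for row in alt_priorities]

names = ["ALT1", "ALT2", "ALT3"]
ranking = sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)
for name, score in ranking:
    print(f"{name} - {score:.3f}")
```

The ordering of the alternatives is the result of interest; small differences in the third decimal place come from rounding the inputs.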


d. Additional Notes. In this example, there was only one level of characteristics.
Additional levels may be considered in the same problem by simply repeating the above
process. This is comparatively easy in a complete hierarchy, in which every item on one level is
related to every item on the next higher level. Our example is a three-level, complete hierarchy,
depicted by the following diagram. Level one is the solution; level two consists of the set of
characteristics; and level three contains the set of alternatives.


Level  1  (Goal) 


Level  2 


Level  3  (Alternatives) 


A  four-level  complete  hierarchy  might  look  like  the  following,  containing  the  solution  on  level 
one,  sets  of  characteristics  on  levels  two  and  three,  and  the  alternatives  on  level  four. 


Level  1  (Goal) 


Level  2 


Level  3 


Level  4  (Alternatives)

In any case, the procedure remains the same. You compare the alternatives with respect to each
of the characteristics in the next higher level; then compare the characteristics with respect to
each of the superior characteristics in the next higher level, and so on, developing a set of
priority vectors at each level. Then you compare the highest level of characteristics with respect
to the solution. Finally, you must multiply each set of priority vectors in the correct order, to
obtain the solution vector. The correct order of multiplication is as follows: Put the lowest
level's set of priority vectors on the left (this will be the set of vectors resulting from comparing
the alternatives to each other), and place each successively higher set of vectors to the right.
Then multiply the matrices and vectors from left to right.
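
The left-to-right multiplication for a four-level complete hierarchy can be sketched as below. All numbers are hypothetical and chosen only so that each column of priorities sums to 1; the helper `matmul` and the variable names are assumptions for illustration, not part of the AHP program.

```python
def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

# Hypothetical level 4: three alternatives compared under two level-3 characteristics.
alts_wrt_level3 = [[0.5, 0.2], [0.3, 0.3], [0.2, 0.5]]
# Hypothetical level 3: two characteristics compared under two level-2 characteristics.
level3_wrt_level2 = [[0.60, 0.25], [0.40, 0.75]]
# Hypothetical level 2: two characteristics compared with respect to the goal.
level2_wrt_goal = [[0.7], [0.3]]

# Lowest level on the left, each higher level to the right, multiplied left to right.
solution = matmul(matmul(alts_wrt_level3, level3_wrt_level2), level2_wrt_goal)
for name, (score,) in zip(["ALT1", "ALT2", "ALT3"], solution):
    print(f"{name} - {score:.4f}")
```

Because every column of priorities sums to 1, the solution vector also sums to 1, which is a convenient check on the order of multiplication.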

A  slightly  more  complicated  hierarchy  is  an  "incomplete"  one,  where  each  item  on  one  level  is 
not  necessarily  related  to  every  item  on  the  level  above,  as  shown  below. 


Level  1  (Goal) 


Level  2 


Level  3 


Level  4  (Alternatives) 


The easiest way to solve this type of hierarchy is to convert it into a complete hierarchy by
putting zeros in the matrix of comparisons to indicate no relationship between characteristics.
After that, the problem is solved as a complete hierarchy.

If any questions or problems arise from the use of this method, or the AHP program, contact
Mr. D'Errico at 3914 Eve Ct, Columbus GA 31909. Office phone (706) 545-7611.

e. Displays. The following facsimile screens display all the prompts, inputs, and menus,
based on the example in the text, above.

Load GWBASIC (assuming it's in the DOS directory).


Load the AHP model from disk drive A, and enter command "RUN"

LOAD"A:AHP"


Enter  three  lines  to  describe  this  run.  Press  [Enter]  to  leave  a  line  blank. 

DESCRIPTION  OR  TITLE  FOR  THIS  ANALYSIS  (Enter  3  lines  for  title) 
AHP-Test  #1  [Enter] 

20  Dec  92  [Enter] 

John D'Errico  [Enter]

Select  "Enter  Data"  from  the  main  menu.


MAIN  MENU 


1  -  ENTER  DATA

2  -  CHANGE  DATA

3  -  SAVE  DATA  (As  soon  as  you  have  entered  all  data!)

4  -  PERFORM  COMPUTATIONS

5  -  END  PROGRAM

(SELECT  ONE  OF  THE  ABOVE  NUMBERS) 


Enter  data  from  the  keyboard  (K),  unless  previously  saved  on  a  disk(D). 


When  retrieving  data  from  a  disk,  be  prepared  to  enter  the  disk/path/filename. 


ENTER  NAME  OF  DISK:  FILE 

?  a:ahp  Itest.dat  (This  file  was  included  on  your  modeling  disk) 


After  the  data  has  been  entered  or  changed,  re-save  it,  and  select  item  4. 

MAIN  MENU

1  -  ENTER  DATA

2  -  CHANGE  DATA

3  -  SAVE  DATA

4  -  PERFORM  COMPUTATIONS

5  -  END  PROGRAM

(SELECT  ONE  OF  THE  ABOVE  NUMBERS)

?  4

Printing will stop after the title, alternatives, and characteristics are printed,
in case you want to start printing the results on a new page for a cleaner look.


DO  YOU  WANT  TO  SKIP  TO  NEXT  PAGE  ?
(Y/N)?  Y


After the results are printed, select "5" to end the program.


MAIN  MENU

1  -  ENTER  DATA

2  -  CHANGE  DATA

3  -  SAVE  DATA

4  -  PERFORM  COMPUTATIONS

5  -  END  PROGRAM

(SELECT  ONE  OF  THE  ABOVE  NUMBERS)

Enter the command "SYSTEM" (without quotation marks) to return to DOS.


f.  Results.  The  following  results  were  based  on  the  example  given  in  the  text  above. 


AHP-Test #1
20 Dec 92
John D'Errico

ALTERNATIVES EVALUATED:

ALT1
ALT2
ALT3

CHARACTERISTICS CONSIDERED:

CHAR1
CHAR2
CHAR3
CHAR4
CHAR5

CHARACTERISTIC VALUES
         CHAR1   CHAR2   CHAR3   CHAR4   CHAR5
CHAR1    1.00    5.00    3.00    3.00    9.00
CHAR2    0.20    1.00    0.60    0.60    1.80
CHAR3    0.33    1.67    1.00    1.00    3.00
CHAR4    0.33    1.67    1.00    1.00    3.00
CHAR5    0.11    0.55    0.33    0.33    1.00

EIGENVECTOR    MAX EIGENVALUE: 4.99    CONSISTENCY RATIO: -0.00230
0.51
0.10
0.17
0.17
0.06

CHAR1    ALT1    ALT2    ALT3    EIGENVECTOR    EIGENVALUE
ALT1     1.00    0.50    2.00       0.29           3.00
ALT2     2.00    1.00    4.00       0.57
ALT3     0.50    0.25    1.00       0.14
CONSISTENCY RATIO: 0.00000

CHAR2    ALT1    ALT2    ALT3    EIGENVECTOR    EIGENVALUE
ALT1     1.00    1.00    0.50       0.25           3.00
ALT2     1.00    1.00    0.50       0.25
ALT3     2.00    2.00    1.00       0.50
CONSISTENCY RATIO: 0.0000

CHAR3    ALT1    ALT2    ALT3    EIGENVECTOR    EIGENVALUE
ALT1     1.00    2.00    2.00       0.50           3.00
ALT2     0.50    1.00    1.00       0.25
ALT3     0.50    1.00    1.00       0.25
CONSISTENCY RATIO: 0.00000

CHAR4    ALT1    ALT2    ALT3    EIGENVECTOR    EIGENVALUE
ALT1     1.00    1.00    2.00       0.40           3.00
ALT2     1.00    1.00    2.00       0.40
ALT3     0.50    0.50    1.00       0.20
CONSISTENCY RATIO: 0.0000

CHAR5    ALT1    ALT2    ALT3    EIGENVECTOR    EIGENVALUE
ALT1     1.00    1.00    1.00       0.33           3.00
ALT2     1.00    1.00    1.00       0.33
ALT3     1.00    1.00    1.00       0.33
CONSISTENCY RATIO: 0.0000

RANKING OF ALTERNATIVES:
ALT2 - 0.443
ALT1 - 0.340
ALT3 - 0.217
G.  DATA  RANKING.

1.  Introduction.

a.  Description.  This  program,  developed  by  Mr.  John  D'Errico,  takes  any  set  of  numerical 
data  as  input,  sorts  it  into  ascending  and  descending  orders,  and  provides  the  ranks  associated 
with  each. 

b.  Limitations.  This  program  can  accept  a  maximum  of  1000  data  points.

c.  Applications.  Desktop  tool  for  data  analysis. 

d.  Setup.  This  model  runs  on  a  DOS-based  computer.  Data  entry  consists  solely  of 
entering  the  numbers  to  be  sorted  and  ranked.  Sorting  and  ranking  will  usually  take  less  than  a 
minute. 

2.  Guide  to  Operation. 

a.  Equipment  Required. 

(1)  IBM  compatible  PC  computer. 

(2)  3.5"  disk  drive. 

(3)  Printer. 


b.  Installation. 

(1)  Turn  on  the  computer  and  get  to  the  DOS  prompt. 

(2) Insert the 3.5" disk containing the RANKDATA program into your computer's
disk drive.

(3) From the DOS prompt, enter the command A:RANKDATA (or B:RANKDATA if
you're using the B: disk drive).

c.  Definitions. 

(1) Rank. After a set of numbers is put into order (ascending order, for example) the
rank of each number is simply the number of its position in the ordered list. However, when the
same number is repeated on the list, their rank is determined by averaging the numbers of their
positions. For example, assume that the numbers 12, 3, 17, 11, 12, 6, 42, 3 must be ranked.
The first task is to sort the numbers into (for this example) ascending order. The sorted list of
numbers then becomes 3, 3, 6, 11, 12, 12, 17, 42. The number 3 occupies positions 1 and 2, so
each 3 gets a rank of 1.5 since (1 + 2)/2 = 3/2 = 1.5. The number 6 gets a rank of 3 since it holds
position 3; the number 11 gets a rank of 4 since it holds position 4; the numbers 12, occupying
positions 5 and 6, each get a rank of 5.5 since that's the average of position numbers 5 and 6.
Number 17 gets a rank of 7 since it holds position 7, and the number 42 gets a rank of 8 since it
holds position 8 in the ordered list of eight numbers.
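
The tie-averaging rule can be sketched in Python as follows. This is a minimal sketch of the ranking rule described above, not the RANKDATA program itself; the function name `average_ranks` is an assumption.

```python
def average_ranks(values):
    """Sort ascending and give tied values the mean of the positions they occupy."""
    ordered = sorted(values)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j to the last position holding the same value as position i.
        while j + 1 < len(ordered) and ordered[j + 1] == ordered[i]:
            j += 1
        # Positions are 1-based, so the tied block covers i + 1 through j + 1.
        rank_of[ordered[i]] = ((i + 1) + (j + 1)) / 2
        i = j + 1
    return [(value, rank_of[value]) for value in ordered]

ranked = average_ranks([12, 3, 17, 11, 12, 6, 42, 3])
for value, rank in ranked:
    print(value, rank)
```

Ranks for descending order follow from the same function applied to the negated values, or by reversing the positions before averaging.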

(2) Ascending Order. Numbers in ascending order are listed with the smallest number
at the top of the list and the largest number at the bottom of the list.

(3) Descending Order. Numbers in descending order are listed with the largest number
at the top of the list and the smallest number at the bottom of the list.

d.  Operation. 

(1)  The  first  display  is  as  follows:

This  program  accepts  up  to  1000  numbers,  then  prints
the  numbers  as  entered,  followed  by  the  numbers
in  ascending  and  descending  orders  and  their
associated  ranks.  Ranks  are  assigned  from  1  to  n.

Tied  scores  are  assigned  the  mean  of  the  ranks  for
which  they  are  tied.

(Press  RETURN  to  begin  the  program)

Screen  #1 

(2) The second display prompts you to enter score (number) #1, score #2, score #3,
etc., with the instruction to enter the number -99 when you have no more numbers to enter.


Enter  score  #1  ?  3

(Enter  -99  after  last  score  has  been  entered)

Screen  #2


Screen  #3


(4) The following is a sample printing from this program.

Data Entered: 5 3 11 2 7 28 5 2 9 24 35 17 12 7 9
16 12 3 7 9 2 4 6


    ASCENDING            DESCENDING
DATA      RANKS       DATA      RANKS
  2         2          35         1
  2         2          28         2
  2         2          24         3
  3         4.5        17         4
  3         4.5        16         5
  4         6          12         6.5
  5         7.5        12         6.5
  5         7.5        11         8
  6         9           9        10
  7        11           9        10
  7        11           9        10
  7        11           7        13
  9        14           7        13
  9        14           7        13
  9        14           6        15
 11        16           5        16.5
 12        17.5         5        16.5
 12        17.5         4        18
 16        19           3        19.5
 17        20           3        19.5
 24        21           2        22
 28        22           2        22
 35        23           2        22

(5)  At  this  point  the  program  ends  and  returns  you  to  the  DOS  prompt.

H.  LAGRANGE  INTERPOLATION.


1.  Introduction.

a. Description. This model, developed by Mr. John D'Errico, uses Lagrange polynomials
to interpolate between two points on a nonlinear curve.

b. Limitations. This method is subject to error if using a large number of known data
points as a basis for the interpolation.

c. Applications. Desktop tool for data analysis.

d. Setup. This program runs on an IBM compatible PC computer. It takes approximately
one minute to enter five data points, and less than a minute to display the interpolation.

2.  Guide  to  Operation. 

a.  Equipment  Required. 

(1) IBM compatible PC computer.

(2)  3.5"  disk  drive. 

b.  Installation. 

(1)  Turn  on  the  computer  and  get  to  the  DOS  prompt. 

(2)  Insert  the  3.5"  disk  containing  the  LAGRANGE  program  into  your  disk  drive. 

(3) From the DOS prompt enter the command: A:LAGRANGE

c.  Explanation. 

(1) Given a set of data points such as those in Table 1, there is often a need to
determine a data point which is not listed in the table. For example, we might need to estimate
the probability of hit [P(H)] at a range of 1.2 kilometers, based on the data in Table 1.


RANGE (km)    P(H)

   0.1        0.90
   0.5        0.88
   1.0        0.58
   2.0        0.18

Table 1

(2) A common practice, due to its simplicity and speed, is to interpolate linearly
between two given data points. Using Table 1 this would mean interpolating between the
ranges of 1.0 and 2.0 kilometers in order to find the value not given: the P(H) at 1.2 kilometers.
However, when the given data do not fall along a straight line, linear interpolation is subject
to gross errors, particularly if the data points within which the interpolation is done are not close
together.
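
For reference, the linear estimate described above can be worked out directly. The short sketch below assumes the Table 1 values at 1.0 and 2.0 kilometers (P(H) of 0.58 and 0.18); the function name `linear_interpolate` is introduced here for illustration only.

```python
def linear_interpolate(x0, y0, x1, y1, x):
    """Straight-line estimate of y at x between (x0, y0) and (x1, y1)."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# P(H) at 1.2 km from the Table 1 entries at 1.0 km and 2.0 km.
p_hit = linear_interpolate(1.0, 0.58, 2.0, 0.18, 1.2)
print(round(p_hit, 2))  # 0.5
```

The straight-line estimate is about 0.50, which can be compared with the value the LAGRANGE program reports for the same range later in this section.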


(3) The method described herein uses Lagrange's form of interpolation polynomials.
This is a widely used form for interpolation within a set of given data points. The given data
points may be equally or unequally spaced, and may lie along a nonlinear curve.

(4) This method is also subject to errors, particularly if using a large number of given
data points to make the interpolation. A way to minimize the error is to take a couple of data
points on either side of the value to be interpolated, ignoring the data points which are farther
away.


(5)  This  method  is  presented  as  an  alternative  to  linear  interpolation,  not  as  a 
substitute,  and  it  is  to  be  used  when  a  straight  line  would  be  substantially  off  the  true  curve  of 
known  data  points,  as  shown  in  Figure  1. 


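[Figure 1. Straight-line interpolation compared with the curve through the known data points; horizontal axis: RANGE.]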


(6)  This  method  should  only  be  used  to  interpolate  within  the  range  of  given  data 
points.  There  are  better  methods  for  interpolating  (or  extrapolating)  outside  the  range  of  the 
given data points; namely, Newton's forward and backward differences, among others.
(7) There is more than one way to derive the approximating interpolating polynomial
used herein, such as the method of undetermined coefficients; however, the method selected here
is straightforward and was easy to program.

(8)  Equations. 

(a) Since the derivations, proofs, and uniqueness theorems are readily available in a
multitude of books on numerical analysis, these are not duplicated in this paper.

(b) Given a set of n+1 data points of the form (x, f(x)), the collocation polynomial
(the nth degree polynomial fitting those points) is

$$p(x) = \sum_{j=0}^{n} f(x_j)\, L_j(x)$$

where, for each j, $0 \le j \le n$, $f(x_j)$ is the given value along the y-axis associated with the given $x_j$
value along the x-axis, and $L_j(x)$ is the nth degree polynomial defined as

$$L_j(x) = \frac{(x-x_0)(x-x_1)\cdots(x-x_{j-1})(x-x_{j+1})\cdots(x-x_n)}{(x_j-x_0)(x_j-x_1)\cdots(x_j-x_{j-1})(x_j-x_{j+1})\cdots(x_j-x_n)} = \prod_{\substack{i=0 \\ i \ne j}}^{n} \frac{x-x_i}{x_j-x_i}$$
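
The collocation polynomial can be evaluated directly from its definition. The following Python sketch is not the LAGRANGE program itself; it is a minimal illustration, with the function name `lagrange_interpolate` assumed here, applied to the four Table 1 points.

```python
def lagrange_interpolate(points, x):
    """Evaluate the Lagrange collocation polynomial through `points` at x.

    `points` is a list of (x_j, f(x_j)) pairs with distinct x_j values.
    """
    total = 0.0
    for j, (xj, yj) in enumerate(points):
        basis = 1.0  # accumulates L_j(x)
        for i, (xi, _) in enumerate(points):
            if i != j:
                basis *= (x - xi) / (xj - xi)
        total += yj * basis
    return total


# Table 1: range in kilometers, probability of hit.
table_1 = [(0.1, 0.90), (0.5, 0.88), (1.0, 0.58), (2.0, 0.18)]
estimate = lagrange_interpolate(table_1, 1.2)
print(f"{estimate:.6f}")  # 0.434784
```

With all four Table 1 points, the estimate at 1.2 kilometers agrees with the value of .4347836 shown on the program's final screen.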


d.  Operation. 


(1) Using the data in Table 1, assume that we want to estimate the probability of hit at
1.2 kilometers. Table 1 is repeated below.


RANGE (km)    P(H)

   0.1        0.90
   0.5        0.88
   1.0        0.58
   2.0        0.18

Table 1


(2)  The  LAGRANGE  program's  first  screen  asks  if  you  want  a  program  description 
and  explanation  displayed  on  the  screen.  For  this  example  we  will  select  Y(es). 


If  you  want  a  brief  program  description/explanation,  enter 
Y  or  y  and  press  [Enter];  otherwise,  simply  press  [Enter]. 

?  y 


Screen  #1 
(3) The next three displays consist of the explanation.


DESCRIPTION

This program accepts any number of (x,y) coordinates,
determines the Lagrange form of interpolating polynomial
which fits the (x,y) data points, and then asks the user to enter any
number of x-values for which a y-value must be
predicted.

It is recommended that this program be used to interpolate
[line illegible in source]
and that only 3 to 6 coordinates of known points be used for this
interpolation.

(Press  [Enter]  to  continue) 


Screen  #2 


(4)  Continued  description: 


When entering the first set of data, simply enter the x-value and y-value,
separated by a comma, and press [Enter] after each pair of coordinates.
For example, to enter the coordinates (1,2), (2,4), and (3,9) you would
first enter a "3" in response to "Enter the number of known (x,y) data
points." Then you would enter the three coordinates as follows:

1,2  [Enter] 

2,4  [Enter] 

3,9  [Enter] 

(Press  [Enter]  to  continue) 


(5) Final screen of descriptions.

After you have entered the known (x,y) coordinates, you will be asked
to enter the number of x-values for which you need predicted y-values.
Simply enter the number of x-values for which you need y-values
interpolated. Finally, you will be asked to enter the x-values, one at a
time, pressing the [Enter] key after each x-value entry.


Screen  #4 
(6) Now you will be prompted to enter the number of known data points.


Screen  #5 

(7)  The  next  four  prompts  ask  you  to  enter  the  data  points.  Only  the  first  prompt  is 
shown  here,  since  the  remaining  three  are  identical  except  for  the  coordinate  entered. 


Enter X,Y for Data Point #1
?  .1,  .9 


(Enter  the  X  and  Y  values,  separated  by  a  comma) 


Screen  #6 


(8) The next prompt asks for the number of x-values for which you need a y-value.
Only one y-value is requested in this example: the P(H) at 1.2 kilometers.


Enter the number of x-values for which you need
a y-value predicted.


Screen  #7 


(9) Now you must enter the single x-value. For this example the response is 1.2,
representing 1.2 kilometers.


Enter  X  value  #  1 


?  1.2 


Screen  #8 


(10)  The  final  display  lists  the  x-values  and  y-values  you  entered,  followed  by  the 
x-value  and  y-value  you  needed  interpolated. 


   X        Y

   .1       .9
   .5       .88
   1        .58
   2        .18

   1.2      .4347836

c:\>

Screen #9


(11) As you can see from Screen #9, the program has ended and returned you to the
prompt you started with; in this case, the root directory.


I. FUNDAMENTAL DUEL

1. Introduction.

a.  Description.  This  model  (Reference  4,  chapter  17)  depicts  the  outcome  of  two 
opposing,  single  shot,  direct  fire  weapon  systems,  each  having  an  unlimited  amount  of 
ammunition.  Inputs  required  are  each  weapon's  reliability,  rate  of  fire,  and  probability  of  kill 
given  a  single  shot.  The  results  are  displayed  in  terms  of  the  probability  that  the  Blue  weapon 
wins  the  duel  and  the  probability  that  the  Red  weapon  wins  the  duel. 

b.  Limitations.  This  model  evaluates  the  outcome  of  a  simple  one-on-one  duel,  based  on 
rates  of  fire  and  exponentially  distributed  firing  times  between  rounds. 

c. Applications. Desktop analytic tool for applying a simple concept to evaluations of
single  shot  weapon  systems. 

d.  Setup.  This  model  runs  on  an  IBM  compatible  PC  computer  equipped  with  Lotus 
1-2-3.  Data  can  be  entered  into  the  model  and  results  displayed  in  less  than  a  minute. 

2.  Guide  to  Operation. 

a.  Equipment  Required. 

(1)  IBM  PC  compatible  computer. 

(2) 3.5" disk drive.

(3)  Lotus  1-2-3  spreadsheet  software. 

b.  Installation. 

(1)  Turn  on  your  computer  and  activate  Lotus  1-2-3. 

(2)  Insert  the  3.5"  disk  containing  the  FUNDUEL  model  into  your  computer's  3.5" 
disk  drive. 

(3)  From  the  Lotus  1-2-3  menu,  load  the  A:FUNDUEL.WK1  model  (or  B:FUNDUEL 
if  you're  working  from  the  b:  drive)  by  entering  /FR,  then  backspace  to  erase  the  default  path,  and 
enter  A:  and  press  the  [Enter]  key.  After  the  Lotus  1-2-3  files  are  shown,  select  the  FUNDUEL 
file. 
c. Operation.

(1) Move the cursor to the cell you wish to change. This should be cell B2, B3, B4,
B6, B7, B8, B10, or B11.

(2) As you make a change in one cell, the probabilities of Blue and Red winning (cells
F5 and F6 respectively) are automatically recalculated for a practically instantaneous answer.


THE FUNDAMENTAL DUEL

REL-B        1
PR HIT-B     0.6
PR K/H-B     0.7
PSSK-B       0.42         PROB-B    0.567568
REL-R        1            PROB-R    0.432432
PR HIT-R     0.8          CHECK     1
PR K/H-R     0.8
PSSK-R       0.64
ROF-B        2
ROF-R        1

Screen #1


d.  Definitions. 


REL-B: Reliability of the Blue weapon system.

PR HIT-B: Blue weapon's probability of hitting the Red target.

PR K/H-B: Blue weapon's probability of kill, given a hit, on the Red target.

PSSK-B: Blue weapon's probability of kill given a single shot at the Red target. It is the
product of the probability of hit and the probability of kill given a hit.

REL-R: Reliability of the Red weapon system.

PR HIT-R: Red weapon's probability of hitting the Blue target.

PR K/H-R: Red weapon's probability of kill, given a hit, on the Blue target.

PSSK-R: Red weapon's probability of kill given a single shot at the Blue target. It is the
product of the probability of hit and the probability of kill given a hit.

ROF-B: Blue weapon's rate of fire.

ROF-R: Red weapon's rate of fire.

PROB-B: The probability that Blue wins the duel.

PROB-R: The probability that Red wins the duel.

CHECK: Verification of the equations. It is the sum of PROB-B and PROB-R, which should
be equal to 1.00.

e.  Explanation. 

(1) Background. In a fundamental duel, it is hypothesized that two duelists, Blue (B)
and Red (R), fire at each other until one is put out of action. The firing times, or times between
rounds, for each duelist are considered to be of a random character with known probability density
functions, the parameters for which may be different for Blue and Red. At the start of the
engagement, each contestant loads, aims, and fires his first round at his opponent. Thus, in the
fundamental duel, both start with unloaded weapons. It is also assumed here that each time Blue
and Red fire at each other they have constant single shot kill probabilities, although such kill
probabilities of Blue and Red may be different. Both Blue and Red have unlimited ammunition
supplies, so that a kill is certain.

(2) Definitions.

$\rho_B$ = mean rate of fire of Blue (B)
$\rho_R$ = mean rate of fire of Red (R)
$p_B$ = single shot kill probability of Blue against Red
$p_R$ = single shot kill probability of Red against Blue
P(B) = chance that B wins the duel
P(R) = chance that R wins the duel = 1 - P(B)

(3) The mean rates of fire, $\rho_B$ and $\rho_R$, are, respectively, the reciprocals of the mean
times between rounds fired by Blue and Red.

(4) The single shot chances of kill, $p_B$ and $p_R$, may be built up or determined by taking
the product of the chance of a hit and the conditional probability that a hit is a kill; i.e.,

$$p_B = p_B(h)\, p_B(k \mid h) \quad \text{and} \quad p_R = p_R(h)\, p_R(k \mid h).$$

(5) Finally, we make an assumption that appears of practical value; namely, that the
time to fire the first round and the times between rounds fired for B and R follow single
parameter negative exponential distributions. So, for random times t,

$$f(t) = \rho \exp(-\rho t),$$

where $\rho = \rho_B$ or $\rho_R$ as needed. Mean time between rounds = $1/\rho$.

(6) Since the exponential distribution is equivalent to the chi-square distribution with
two degrees of freedom, this means that the time at which the nth round is fired is the sum of n
independent selections from the above equation, or the chi-square distribution with 2n degrees of
freedom (or the gamma distribution) given by

$$f_n(t) = \frac{\rho^n\, t^{\,n-1} \exp(-\rho t)}{(n-1)!}.$$

Then, the chance that Blue wins is

$$P(B) = \frac{p_B \rho_B}{p_B \rho_B + p_R \rho_R}$$

and the chance that Red wins is

$$P(R) = 1 - P(B) = \frac{p_R \rho_R}{p_B \rho_B + p_R \rho_R}.$$

(7) Consequently, for exponentially distributed firing times between rounds, the chance
that a side wins is the kill rate for that side divided by the sum of the kill rates for both sides,
which is a rather simple outcome. Hence, the value of kill rate as a key measure of effectiveness
is evident. Note that if the single shot kill probabilities of B and R are equal, then their rates of
fire take over; and if their rates of fire also are equal, B and R each have a 50% chance of
winning. The chance of a draw, or both being killed, is zero.
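
The kill-rate ratio described in paragraph (7) is easy to check by hand or in code. The Python sketch below is not the FUNDUEL spreadsheet; it is a minimal illustration with assumed function and argument names, and it leaves weapon reliability out of the single shot kill probability, following the PSSK definitions in paragraph d.

```python
def fundamental_duel(p_hit_b, p_kill_hit_b, rof_b,
                     p_hit_r, p_kill_hit_r, rof_r):
    """Return (P(B), P(R)) for the fundamental duel with exponential firing times."""
    pssk_b = p_hit_b * p_kill_hit_b   # Blue single shot kill probability
    pssk_r = p_hit_r * p_kill_hit_r   # Red single shot kill probability
    kill_rate_b = pssk_b * rof_b      # Blue kills per unit time
    kill_rate_r = pssk_r * rof_r      # Red kills per unit time
    prob_b = kill_rate_b / (kill_rate_b + kill_rate_r)
    return prob_b, 1.0 - prob_b


# Inputs taken from Screen #1 of the FUNDUEL example.
prob_b, prob_r = fundamental_duel(0.6, 0.7, 2, 0.8, 0.8, 1)
print(f"{prob_b:.6f} {prob_r:.6f}")  # 0.567568 0.432432
```

With the Screen #1 inputs, Blue's kill rate is 0.84 and Red's is 0.64, so Blue wins with probability 0.84/1.48.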
J. FUNDAMENTAL DUEL WITH LIMITED AMMUNITION FOR BLUE

1. Introduction.

a. Description. This model (Reference 4, chapter 17) depicts the outcome of two
opposing,  single  shot,  direct  fire  weapon  systems  when  the  Blue  weapon  system  has  a  limited 
amount  of  ammunition.  Inputs  required  are  each  weapon's  reliability,  rate  of  fire,  and  probability 
of  kill  given  a  single  shot.  The  results  are  displayed  in  terms  of  the  probability  that  the  Blue 
weapon  wins  the  duel  and  the  probability  that  the  Red  weapon  wins  the  duel. 

b. Limitations. This model evaluates the outcome of a simple one-on-one duel, based on
rates of fire, probabilities of kill, and exponentially distributed firing times between rounds.

c. Applications. Desktop analytic tool for applying a simple concept to evaluations of
single shot weapon systems.

d. Setup. This model runs on an IBM compatible PC computer equipped with Lotus
1-2-3. Data can be entered into the model and results displayed in less than a minute.

2.  Guide  to  Operation. 

a.  Equipment  Required. 

(1)  IBM  PC  compatible  computer. 

(2) 3.5" disk drive.

(3) Lotus 1-2-3 spreadsheet software.

b. Installation.

(1) Turn on your computer and activate Lotus 1-2-3.

(2) Insert the 3.5" disk containing the LIMAMMOB model into your computer's 3.5"
disk drive.

(3) From the Lotus 1-2-3 menu, load the A:LIMAMMOB model (or B:LIMAMMOB
if you're working from the b: drive) by entering /FR, then backspace to erase the default path, and
enter A: and press the [Enter] key. After the Lotus 1-2-3 files are shown, select the
LIMAMMOB file.
c. Operation.


(1) Move the cursor to the cell you wish to change. This should be cell B2, B3, B4,
B6, B7, B8, B10, B11, or B12.

(2) As you make a change in one cell, the probabilities of Blue and Red winning (cells
E3 and E4 respectively) are automatically recalculated for a practically instantaneous answer.

d. Definitions.

REL-B:  Reliability  of  the  Blue  weapon  system. 

PR  HIT-B:  Blue  weapon's  probability  of  hitting  the  Red  target. 

PR  K/H-B:  Blue  weapon's  probability  of  kill,  given  a  hit,  on  the  Red  target. 

PSSK-B:  Blue  weapon's  probability  of  kill  given  a  single  shot  at  the  Red  target.  It  is  the 
product  of  the  probability  of  hit  and  the  probability  of  kill  given  a  hit. 

REL-R: Reliability of the Red weapon system.

PR HIT-R: Red weapon's probability of hitting the Blue target.

PR K/H-R: Red weapon's probability of kill, given a hit, on the Blue target.

PSSK-R: Red weapon's probability of kill given a single shot at the Blue target. It is the
product of the probability of hit and the probability of kill given a hit.

ROF-B: Blue weapon's rate of fire.

ROF-R: Red weapon's rate of fire.

PROB-B: The probability that Blue wins the duel.

PROB-R: The probability that Red wins the duel.

ROUNDS-B: The number of rounds available to the Blue weapon system.

CHECK: Verification of the equations. It is the sum of PROB-B and PROB-R, which should
be equal to 1.00.

e.  Explanation. 

(1) This model uses the same parameters as the fundamental duel where both sides
have unlimited amounts of ammunition, except that now Blue is limited to N rounds.

(2) When Blue has a fixed number of rounds equal to N, and Red has an unlimited
supply of ammunition, then for the assumption of exponential firing times between rounds, the
chance that Blue wins is given by

$$P(B) = \frac{p_B \rho_B}{p_B \rho_B + p_R \rho_R} \left( 1 - \left[ \frac{q_B \rho_B}{\rho_B + p_R \rho_R} \right]^{N} \right)$$

and

$$P(BR) = 0.$$

Note: $q_B = 1 - p_B$ = single shot survival probability for Red when fired on by Blue.
As in the fundamental duel, $\rho_B$ and $\rho_R$ are the rates of fire and $p_B$ and $p_R$ the single shot
kill probabilities.

P(BR) = chance of a draw (B and R kill each other).

(3) See paragraph I.2.e. for additional explanations of the fundamental duel.
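
The effect of Blue's ammunition limit can be explored with a short calculation. The Python sketch below assumes the closed form P(B) = [p_B rho_B / (p_B rho_B + p_R rho_R)] (1 - [q_B rho_B / (rho_B + p_R rho_R)]^N), which is what results from summing, over the round on which Blue scores the kill, the chance that Blue's earlier rounds all missed and Red has not yet killed Blue. It is not the LIMAMMOB spreadsheet; the function name and the sample inputs (borrowed from the other duel screens in this guide) are illustrative assumptions.

```python
def duel_limited_blue(pssk_b, rof_b, pssk_r, rof_r, rounds_b):
    """Return (P(B), P(R)) when Blue has `rounds_b` rounds and Red is unlimited.

    Assumes exponential firing times; a draw has probability zero.
    """
    q_b = 1.0 - pssk_b  # chance Red survives one Blue round
    unlimited = pssk_b * rof_b / (pssk_b * rof_b + pssk_r * rof_r)
    # Chance that Blue fires all of its rounds, misses with each, and is still alive.
    exhausted = (q_b * rof_b / (rof_b + pssk_r * rof_r)) ** rounds_b
    prob_b = unlimited * (1.0 - exhausted)
    return prob_b, 1.0 - prob_b


# Illustrative inputs only: PSSK-B 0.42, ROF-B 2, PSSK-R 0.64, ROF-R 1.
for n in (1, 5, 50):
    prob_b, prob_r = duel_limited_blue(0.42, 2, 0.64, 1, n)
    print(n, f"{prob_b:.6f}", f"{prob_r:.6f}")
```

With a single round, Blue wins only if that round is fired before Red's killing round and it kills. As N grows, the result approaches the fundamental duel value.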
J. FUNDAMENTAL DUEL WITH LIMITED AMMUNITION FOR RED

1. Introduction.

a. Description. This model (Reference 4, chapter 17) depicts the outcome of two
opposing,  single  shot,  direct  fire  weapon  systems  when  the  Red  weapon  system  has  a  limited 
amount  of  ammunition.  Inputs  required  are  each  weapon's  reliability,  rate  of  fire,  and  probability 
of  kill  given  a  single  shot.  The  results  are  displayed  in  terms  of  the  probability  that  the  Blue 
weapon  wins  the  duel  and  the  probability  that  the  Red  weapon  wins  the  duel. 

b.  Limitations.  This  model  evaluates  the  outcome  of  a  simple  one-on-one  duel,  based  on 
rates  of  fire,  probabilities  of  kill,  and  exponentially  distributed  firing  times  between  rounds. 

c. Applications. Desktop analytic tool for applying a simple concept to evaluations of
single shot weapon systems.

d.  Setup.  This  model  runs  on  an  IBM  compatible  PC  computer  equipped  with  Lotus 
1-2-3.  Data  can  be  entered  into  the  model  and  results  displayed  in  less  than  a  minute. 

2.  Guide  to  Operation. 

a. Equipment Required.

(1) IBM PC compatible computer.

(2)  3.5"  disk  drive. 

(3)  Lotus  1-2-3  spreadsheet  software. 

b.  Installation. 

(1)  Turn  on  your  computer  and  activate  Lotus  1-2-3. 

(2) Insert the 3.5" disk containing the LIMAMMOR model into your computer's 3.5"
disk drive.

(3) From the Lotus 1-2-3 menu, load the A:LIMAMMOR model (or B:LIMAMMOR
if you're working from the b: drive) by entering /FR, then backspace to erase the default path, and
enter A: and press the [Enter] key. After the Lotus 1-2-3 files are shown, select the
LIMAMMOR file.

c. Operation.


(1) Move the cursor to the cell you wish to change. This should be cell B2, B3, B4,
B6, B7, B8, B10, B11, or B12.

(2)  As  you  make  a  change  in  one  cell,  the  probabilities  of  Blue  and  Red  winning  (cells 
E3  and  E4  respectively)  are  automatically  recalculated  for  a  practically  instantaneous  answer. 

BASIC DUEL (LIMITED AMMO FOR RED)

REL-B        1
PR HIT-B     0.6
PR K/H-B     0.7
PSSK-B       0.42         PROB-B    0.567582
REL-R        1            PROB-R    0.432418
PR HIT-R     0.8          CHECK     1
PR K/H-R     0.8
PSSK-R       0.64
ROF-B        2
ROF-R        1
ROUNDS-R     5

Screen #1


d.  Definitions. 

REL-B:  Reliability  of  the  Blue  weapon  system. 

PR  HIT-B:  Blue  weapon's  probability  of  hitting  the  Red  target. 

PR  K/H-B:  Blue  weapon's  probability  of  kill,  given  a  hit,  on  the  Red  target. 

PSSK-B:  Blue  weapon's  probability  of  kill  given  a  single  shot  at  the  Red  target.  It  is  the 
product  of  the  probability  of  hit  and  the  probability  of  kill  given  a  hit. 

REL-R:  Reliability  of  the  Red  weapon  system. 

PR  HIT-R:  Red  weapon's  probability  of  hitting  the  Blue  target. 

PR K/H-R: Red weapon's probability of kill, given a hit, on the Blue target.

PSSK-R: Red weapon's probability of kill given a single shot at the Blue target. It is the
product of the probability of hit and the probability of kill given a hit.

ROF-B: Blue weapon's rate of fire.

ROF-R: Red weapon's rate of fire.

PROB-B: The probability that Blue wins the duel.

PROB-R: The probability that Red wins the duel.

ROUNDS-R: The number of rounds available to the Red weapon.

CHECK: Verification of the equations. It is the sum of PROB-B and PROB-R, which should
be equal to 1.00.

e.  Explanation. 

(1) This model uses the same parameters as the fundamental duel where both sides
have unlimited amounts of ammunition, except that now Red is limited to M rounds.

(2) When Red has a fixed number of rounds equal to M, and Blue has an unlimited
supply of ammunition, then for the assumption of exponential firing times between rounds, the
chance that Blue wins is given by

$$P(B) = \frac{p_B \rho_B}{p_B \rho_B + p_R \rho_R} + \frac{p_R \rho_R}{p_B \rho_B + p_R \rho_R} \left[ \frac{q_R \rho_R}{\rho_R + p_B \rho_B} \right]^{M}$$

and

$$P(BR) = 0.$$

Note: $q_R = 1 - p_R$ = single shot survival probability for Blue when fired on by Red.

P(BR) = chance of a draw (B and R kill each other).

(3) See paragraph I.2.e. for additional explanations of the fundamental duel.
K. LANCHESTER'S SQUARE LAW AS A FUNCTION OF TIME


1. Introduction.

a. Description. This model (Reference 4, chapter 28) determines the remaining Blue
forces  and  remaining  Red  forces  at  any  given  time  during  a  battle  between  homogeneous  forces. 
Inputs  required  for  each  side  are  the  total  number  of  weapon  systems,  and  each  weapon's 
probability  of  hit,  probability  of  kill  given  a  hit,  and  rate  of  fire. 

b.  Limitations.  This  model  evaluates  the  outcome  of  one  set  of  identical  weapon  systems 
against  an  opposing  set  of  identical  weapon  systems.  It  is  based  on  each  weapon's  constant  kill 
rate  of  opposing  forces. 

c. Applications. Desktop analytic tool for evaluating homogeneous force effectiveness in
terms of a basic concept.

d. Setup. This model runs on an IBM compatible PC computer equipped with Lotus
1-2-3. Data can be entered into the model and results displayed in less than a minute.

2.  Guide  to  Operation. 

a.  Equipment  Required. 

(1)  IBM  PC  compatible  computer. 

(2) 5.25" disk drive.

(3)  Lotus  1-2-3  spreadsheet  software. 

b.  Installation. 

(1)  Turn  on  your  computer  and  activate  Lotus  1-2-3. 

(2)  Insert  the  3.5"  disk  containing  the  LANBASIC  model  into  your  computer's  3.5" 
disk  drive. 

(3)  From  the  Lotus  1-2-3  menu,  load  the  A:LANBASIC  model  (or  B:LANBASIC  if 
you're working from the b: drive) by entering /FR, then backspace to erase the default path, and
enter  A:  and  press  the  [Enter]  key.  After  the  Lotus  1-2-3  files  are  shown,  select  the  LANBASIC 
file. 
c. Operation.


(1) The model parameters are shown below.

BASIC LANCHESTER EQUATION AS FUNCTION OF TIME

BATTLE                      BLUE        RED

INITIAL FORCE SIZE*         100         50
P(H)*                       0.5         0.5
P(K|H)*                     0.25        0.5
RATE OF FIRE*               0.4         0.4
CONSTANT KILL RATE          0.05        0.1
TIME ELAPSED (T)*           2
STRENGTH AT TIME T          90.97       40.47
FORCE ADVANTAGE RATIO       2.25        0.44
LER AT TIME T               1.05
TIME OF ANNIHILATION        ERR         12.4645

                            0.141421    0.141893    1.010017
                            1.414214    0.707107

Screen #1

(2) Move the cursor to the cell you wish to change. You may change only the items in
Screen  #1  which  are  marked  with  an  asterisk.  In  Screen  #1  an  error  (ERR)  is  shown  in  cell  F13 
because  the  given  data  has  the  Red  force  annihilated  before  the  Blue  force;  consequently,  Blue 
cannot  be  annihilated  (no  force  remaining). 

d.  Definitions. 

Initial Force Size: Number of identical weapons on the Blue or Red side.

P(H): A weapon's probability of hit against the opposing force target.

P(K|H): Probability of kill given a hit against the opposing force target.

Rate of Fire: The rate of fire in rounds per time unit (usually minutes).

Constant Kill Rate: The constant rate at which a single weapon kills an opposing force
target.

Time Elapsed: The battle time.

Strength at Time T: The number of Blue or Red weapons remaining at the end of
time T.

Force Advantage Ratio: The number of friendly weapons divided by the number of
opposing force weapons after time T.

LER at Time T: The number of Red losses divided by the number of Blue losses.

Time of Annihilation: The time at which there are no remaining weapons on that side.

An error, indicated by "ERR", will be displayed on the side of the force which has weapons
remaining after the opposing force has been annihilated. This is because the winning force cannot
be annihilated after all opposing weapons have been destroyed.

Note:  The  data  appearing  in  rows  14  and  15  are  intermediate  calculations. 

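e. Explanation of Lanchester's square law as a function of time.

(1) Definitions.

$B_0$ = Initial Blue strength
$R_0$ = Initial Red strength
B = Size of the Blue force at any time t
R = Size of the Red force at any time t

$\beta$ = Constant rate at which a single Blue weapon kills a Red weapon
$\rho$ = Constant rate at which a single Red weapon kills a Blue weapon

(2) The remaining Blue forces B(t) and remaining Red forces R(t) at any time t are
given by

$$B(t) = B_0 \cosh\!\left(\sqrt{\beta\rho}\; t\right) - R_0 \sqrt{\rho/\beta}\; \sinh\!\left(\sqrt{\beta\rho}\; t\right)$$

$$R(t) = R_0 \cosh\!\left(\sqrt{\beta\rho}\; t\right) - B_0 \sqrt{\beta/\rho}\; \sinh\!\left(\sqrt{\beta\rho}\; t\right)$$

(3) The time at which Red is annihilated (i.e., R(t) = 0) is given by

$$t_R = \frac{1}{2\sqrt{\beta\rho}} \ln \frac{\sqrt{\beta}\, B_0 + \sqrt{\rho}\, R_0}{\sqrt{\beta}\, B_0 - \sqrt{\rho}\, R_0}$$

(4) Similarly, if Red wins, then Blue's time of annihilation (i.e., B(t) = 0) is given by

$$t_B = \frac{1}{2\sqrt{\beta\rho}} \ln \frac{\sqrt{\rho}\, R_0 + \sqrt{\beta}\, B_0}{\sqrt{\rho}\, R_0 - \sqrt{\beta}\, B_0}$$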
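
The square-law strengths and the annihilation time can be reproduced outside the spreadsheet. The Python sketch below is not the LANBASIC model; it is a minimal illustration with assumed function names, taking each side's constant kill rate as P(H) x P(K|H) x rate of fire and using the inputs from Screen #1.

```python
import math


def lanchester_strengths(b0, r0, blue_kill_rate, red_kill_rate, t):
    """Return (B(t), R(t)) under Lanchester's square law."""
    k = math.sqrt(blue_kill_rate * red_kill_rate)
    blue = (b0 * math.cosh(k * t)
            - r0 * math.sqrt(red_kill_rate / blue_kill_rate) * math.sinh(k * t))
    red = (r0 * math.cosh(k * t)
           - b0 * math.sqrt(blue_kill_rate / red_kill_rate) * math.sinh(k * t))
    return blue, red


def annihilation_time(winner0, winner_rate, loser0, loser_rate):
    """Time at which the losing side's strength reaches zero."""
    a = math.sqrt(winner_rate) * winner0
    c = math.sqrt(loser_rate) * loser0
    if a <= c:
        raise ValueError("the side passed as winner does not win under the square law")
    return math.log((a + c) / (a - c)) / (2.0 * math.sqrt(winner_rate * loser_rate))


# Screen #1 inputs: kill rate = P(H) * P(K|H) * rate of fire.
blue_rate = 0.5 * 0.25 * 0.4   # 0.05
red_rate = 0.5 * 0.5 * 0.4     # 0.1
blue_t, red_t = lanchester_strengths(100, 50, blue_rate, red_rate, 2)
t_red = annihilation_time(100, blue_rate, 50, red_rate)
print(f"{blue_t:.2f} {red_t:.2f} {t_red:.4f}")  # 90.97 40.47 12.4645
```

The `ValueError` branch plays the role of the ERR entry on the screen: no annihilation time exists for the side that wins.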
L. DUEL WHEN BLUE HAS A WEAPON FAILURE RATE


1. Introduction.

a. Description. This model (Reference 4, chapter 17) depicts the outcome of two
opposing, single shot, direct fire weapon systems, each having an unlimited amount of
ammunition, as in the fundamental duel, but including the idea of weapon failure times. Inputs
required are: each weapon's probability of hit, probability of kill given a hit, round reliability, and
rate of fire; the number of Blue weapons; and the Blue weapon failure rate. Results are displayed
in terms of the probability that Blue wins the duel and the probability that Red wins the duel.

b.  Limitations.  Only  homogeneous  forces  are  used  in  this  model.  Blue  and  Red  have 
unlimited  ammunition  supplies;  Blue  has  a  limited  number  of  weapons,  and  Red  has  a  failure-free 
weapon. 

c.  Applications.  Desktop  analytic  tool  for  applying  simple  failure  rates  to  evaluations  of 
single  shot  weapon  systems. 

d.  Setup.  This  model  runs  on  an  IBM  PC  compatible  computer  equipped  with  Lotus 
1-2-3. Data can be entered into the model and results displayed in less than a minute.

2.  Guide  to  Operation. 

a.  Equipment  Required. 

(1)  IBM  PC  compatible  computer. 

(2) 5.25" disk drive.

(3)  Lotus  1-2-3  spreadsheet  software. 

b.  Installation. 

(1)  Turn  on  your  computer  and  activate  Lotus  1-2-3. 

(2) Insert the 3.5" disk containing the DLFAILB model into your computer's 3.5" disk
drive.


(3)  From  the  Lotus  1-2-3  menu,  load  the  A:DLFAILB  model  (or  B:DLFAILB  if  you’re 
working  from  the  b:  drive)  by  entering  /FR,  then  backspace  to  erase  the  default  path,  and  enter 
A:  and  press  the  [Enter]  key.  After  the  Lotus  1-2-3  files  are  shown,  select  the  DLFAILB  file. 


c.  Operation. 


(1)  The  model  parameters  are  shown  below. 

DUELS WITH WEAPON FAILURE RATES FOR BLUE

REL OF BLUE RD*          1
PROB HIT BLUE RD*        0.6
PROB K|H BLUE RD*        0.7
PSSK BLUE RD             0.42
REL OF RED RD*           1
PROB HIT RED RD*         0.8
PROB K|H RED RD*         0.8
PSSK RED RD              0.64
ROF BLUE*                2
ROF RED*                 1
NUM BLUE WPNS*           1
BLUE WPN FAIL RATE*      0.02
                         0.04
PROB BLUE WINS           0.352632

Screen #1

(2)  Move  the  cursor  to  the  cell  you  wish  to  change.  You  may  change  only  the  items  in 
Screen  #1  which  are  marked  with  an  asterisk. 

d.  Definitions. 

REL  OF  BLUE  RD:  Reliability  of  the  Blue  round. 

PROB  HIT  BLUE  RD:  Blue  weapon's  probability  of  hitting  the  Red  target. 

PROB K|H BLUE RD: Blue round's probability of killing the Red target given a hit.

PSSK BLUE RD: The product of the above three inputs.

REL OF RED RD: Reliability of the Red round.

PROB HIT RED RD: Red weapon's probability of hitting the Blue target.

PROB K|H RED RD: Red round's probability of killing the Blue target given a hit.

PSSK RED RD: The product of the above three inputs.

ROF BLUE: The Blue weapon's rate of fire.

ROF RED: The Red weapon's rate of fire.

NUM BLUE WPNS: The number of Blue weapons in the Blue force.

BLUE WPN FAIL RATE: Failure rate of the Blue weapon.

PROB BLUE WINS: The probability that Blue wins the duel.

PROB RED WINS: The probability that Red wins the duel.

e. Explanation.

(1) Definitions.

P(B) = Probability that Blue wins the duel.

ρ_B = Mean rate of fire for a Blue weapon.

ρ_R = Mean rate of fire for a Red weapon.

p_B = Single shot kill probability of Blue against Red.

p_R = Single shot kill probability of Red against Blue.

μ_B = Mean failure rate for a Blue weapon.

μ_R = Mean failure rate for a Red weapon.

(2) Blue's and Red's weapon failure times are assumed to be exponentially distributed, with mean failure times 1/μ_B and 1/μ_R, respectively, or mean failure rates of μ_B and μ_R. If we further assume that Blue and Red have unlimited ammunition supplies, Blue has a limited number of weapons N, and Red has a failure-free weapon (μ_R = 0), then the chances that Blue and Red win are

P(B) = ρ_B p_B / (ρ_B p_B + ρ_R p_R)

P(R) = 1 - P(B),   P(BR) = P(Draw) = 0


1  - 




M.  ESTIMATING  OPERATIONAL  AVAILABILITY 

1.  Introduction. 


a. Description. A method for estimating operational availability based on a combination of test data and parameters estimated from other sources. This model was developed by Fred Bernstein, Eugene Dutoit, and Greg Meyers (Reference 5).

b. Applications. Desktop model for estimating operational availability. It also gives the reliability analyst the opportunity to determine the sensitivity of operational availability to changes in the parameters that contribute to this measure of readiness.

c.  Setup.  This  model  runs  on  a  DOS-based  computer.  On-hand  data  can  be  entered  into 
the  model  and  results  displayed  in  a  few  minutes. 

2.  Guide  to  Operation. 

a.  Equipment  Required. 

(1) IBM compatible PC computer.

(2) 3.5" disk drive.

(3)  Lotus  1-2-3  spreadsheet  software. 

b.  Installation. 

(1)  Turn  on  the  computer  and  activate  Lotus  1-2-3. 

(2) Insert the 3.5" disk containing the OPERAO model into your 3.5" disk drive.

(3) From the Lotus 1-2-3 menu, load the A:OPERAO model (or B:OPERAO if you're working from the b: drive) by entering /FR, then backspace to erase the default path. Enter A: (or B:) and press the [Enter] key. After the Lotus 1-2-3 files are shown, select the OPERAO file.

c.  Operation. 

(1)  The  model  parameters  are  shown  below. 
(2) Move the cursor to the cell you wish to change. You are allowed to change only the data marked by an asterisk; other data represents calculations made by the model.

ESTIMATING OPERATIONAL AVAILABILITY

OT        10
TT        30
MR        0.3
K         1
ALDT      5
MTBOMF    100

Calculated values:  0.3   0.05   0.35   0.333333   0.116667   0.883333

Screen #1


d. Definitions for Screen #1.

OT:  Operating  Time. 

TT:  Total  Time. 

MR:  Maintenance  Ratio. 

K:  Ratio  of  Maintenance  Manhours  to  Clock  Hours. 

ALDT:  Administrative  Logistic  Downtime. 

MTBOMF:  Mean  Time  Between  Operational  Mission  Failure. 

e.  Explanation: 

(1)  Definitions. 

Ao      Operational Availability.
ALDT    Administrative Logistic Downtime.
DT      Downtime.
K       Ratio of Maintenance Manhours to Clock Hours.
MR      Maintenance Ratio.
MTBOMF  Mean Time Between Operational Mission Failure.
OT      Operating Time.
ST      Standby Time.
TALDT   Total Administrative Logistic Downtime.
TCM     Total Corrective Maintenance.
TPM     Total Preventive Maintenance.
TT      Total Time.

(2) The basic relationship that is used to estimate Operational Availability (Ao) is:

Ao = (OT + ST)/(OT + ST + TCM + TPM + TALDT) (1)

(3)  The  entire  denominator  of  equation  (1)  is  Total  Time  (TT).  The  last  three  terms  of 
the  denominator  account  for  all  the  downtime  (DT).  The  numerator  of  this  equation  represents 
the  total  uptime  (UT)  for  the  system.  An  alternate  way  to  express  uptime  is  to  subtract  the  DT 
from  the  TT.  Equation  (1)  can  then  be  written  as: 

Ao = (TT - DT)/TT = 1 - DT/TT (2)

(4)  Equation  (2)  can  be  expressed  in  terms  of  the  "Downtime"  components  as: 

Ao = 1 - (TCM + TPM + TALDT)/TT (3)

(5) The Maintenance Ratio (MR) is the total number of man-hours of direct maintenance labor in some particular time period divided by the total operating time in this same time period. This can be expressed as:

MR = K * (TCM + TPM)/OT (4)

where K is the ratio of Maintenance Manhours to Maintenance Clock Hours. For example, if two maintenance men work from 12:00 noon to 5:00 PM (10 Maintenance Manhours during a 5 clock hour period of time) then K = 10/5 = 2. Equation (4) can also be written as:

TCM + TPM = (MR) * (OT)/K. (5)


(6) TALDT can be estimated by considering the total number of failures in some given time period multiplied by the average logistical down time for each failure (ALDT). This relationship can be stated as:

TALDT = (OT) * (ALDT)/MTBOMF. (6)

(7) Equations (5) and (6) can be substituted into equation (3). By factoring (OT) and (TT) as common terms, the following estimating relationship is obtained:

Ao = 1 - (OT/TT) * ((MR)/K + (ALDT/MTBOMF)) (7)

(8) Equation (7) can be used to assess the Ao of a system based on a combination of test data and parameter estimates from other sources. The ratio of (OT/TT) can be obtained from the operational mode summary and mission profile for the system. The estimates for the MR and MTBOMF can be obtained from testing and engineering analysis. The values for ALDT and K can be estimated from additional logistical analysis, testing and field reports for existing but similar systems. Equation (7) also gives the reliability analyst the opportunity to determine the sensitivity of Ao to changes in the parameters that contribute to this measure of readiness. This can help determine which factors can be traded off against Ao and still have the system meet the operational requirement of readiness.
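
Equation (7) and the sensitivity analysis described above can be sketched in a few lines of Python. This is an illustration only and is not the spreadsheet model itself; the function and variable names are assumptions, the inputs are those shown in Screen #1, and the 10% perturbation size is an arbitrary choice made for the example.

```python
def operational_availability(ot, tt, mr, k, aldt, mtbomf):
    """Equation (7): Ao = 1 - (OT/TT) * (MR/K + ALDT/MTBOMF)."""
    if tt <= 0 or k <= 0 or mtbomf <= 0:
        raise ValueError("TT, K and MTBOMF must be positive")
    return 1.0 - (ot / tt) * (mr / k + aldt / mtbomf)


# Inputs from Screen #1.
inputs = {"ot": 10.0, "tt": 30.0, "mr": 0.3, "k": 1.0, "aldt": 5.0, "mtbomf": 100.0}
base = operational_availability(**inputs)
print(f"Ao = {base:.6f}")

# One-at-a-time sensitivity: raise each input by 10% and report the change in Ao.
for name, value in inputs.items():
    bumped = dict(inputs, **{name: value * 1.10})
    delta = operational_availability(**bumped) - base
    print(f"{name:7s} +10%: change in Ao = {delta:+.4f}")
```

With the Screen #1 inputs the base case evaluates to 0.883333, matching the last calculated value on that screen.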
N. INDIRECT FIRE EFFECTS


1. Introduction.

a. Description. This model was developed by the Joint Munitions Effectiveness Manuals, Surface to Surface, and published under the authority of the Joint Technical Coordinating Group for Munitions Effectiveness. It calculates the effects of artillery and mortar fires for high explosive and improved conventional munitions. Inputs required are: number of volleys; number of rounds per volley; round reliability; lethal areas; submunition reliability; volley pattern dimensions; target area dimensions; number of submunitions per round; angle of fall; mean point of impact and precision errors; target location error; and pattern adjustment factor. Results are displayed/printed in terms of fractional damage (amount of target destroyed) for the number of volleys used, or the number of volleys required to achieve a desired fractional damage.


b.  Limitations.  Effectiveness  estimates  for  a  large  number  of  volleys  may  be  unreliable 
due  to  the  methodology  used  in  this  model. 

c. Applications. Desktop analytic tool for determining artillery and mortar effects on personnel and materiel targets.

d.  Setup.  This  model  runs  on  an  IBM  compatible  PC  computer.  Data  is  readily  available 
from  Joint  Technical  Coordinating  Group  publications  or  the  Army  Materiel  Systems  Analysis 
Activity  at  Aberdeen  Proving  Ground.  Data  can  be  entered  and  results  displayed  or  printed  in  a 
few  minutes. 

2.  Guide  to  Operation. 

a.  Equipment  Required. 

(1)  IBM  compatible  PC  computer. 

(2) 3.5" disk drive.

(3)  Printer  (optional). 

b.  Installation. 

(1)  Turn  on  the  computer  and  get  to  the  DOS  prompt. 

(2) Insert the 3.5" disk containing the Super Quickie II program into your disk drive.


(3) From the DOS prompt, enter the command A: (or B: if you're operating from the B: disk drive).

(4) Enter the command CD SQ (to change to the SQ directory on the 3.5" disk).

(5)  Enter  the  command  SUPERQ 

(6)  If  you  receive  the  message,  "Enter  run  time  file  path,"  it  is  probably  because  you 
are  not  in  the  SQ  directory  of  the  A:  (or  B:)  drive.  You  cannot  run  this  program  simply  by 
entering  the  name  of  the  executable  file  and  path  (A:\SQ\SUPERQ.EXE). 

c.  Operation. 


(1) The first two screens contain publication, destruction, and copyright information. Please take the time to read these screens.


UNCLASSIFIED
2 DECEMBER 1991

[Army, Navy, USMC, and USAF publication numbers illegible in source]

JMEM/SS SUPERQUICKIE II PROGRAM (SUPERQ) FOR PERSONAL COMPUTERS

DISTRIBUTION STATEMENT B. Distribution authorized to U.S. Government
agencies only; [illegible]; 2 December 1991. Other requests for this
document must be referred to Director, AMSAA, ATTN: AMXSY-J, Aberdeen
Proving Ground, MD 21005-5071.

DESTRUCTION NOTICE. For unclassified, limited documents, destroy by any
method that will prevent disclosure or reconstruction of the document.

REPRODUCTION. [illegible]

PUBLISHED UNDER THE AUTHORITY OF THE JTCG/ME

PRESS THE SPACE BAR TO CONTINUE
UNCLASSIFIED


Screen 1
(2) Super Quickie II is for use by the Department of Defense only.


UNCLASSIFIED

SUPERQ
VERSION 1.0

12/02/91

THIS PROGRAM IS NOT RELEASABLE TO AGENCIES OUTSIDE THE
DEPARTMENT OF DEFENSE WITHOUT THE PRIOR APPROVAL OF THE
APPROPRIATE MEMBER OF THE JOINT TECHNICAL COORDINATING
GROUP FOR MUNITIONS EFFECTIVENESS (JTCG/ME).

SUPERQ IS COMPILED WITH THE MICROSOFT QUICKBASIC COMPILER.
THIS COMPILER AND THE BRUN45.EXE FILE ON THIS DISKETTE ARE
COPYRIGHTED BY THE MICROSOFT CORPORATION.

PRESS THE SPACE BAR TO CONTINUE

UNCLASSIFIED


Screen 2

(3)  The  next  prompt  asks  you  to  select  the  amount  of  time  you  want  messages 
displayed.  "Short"  is  recommended. 


MESSAGE DISPLAY TIME


1 SHORT
2 MEDIUM
3 LONG


ENTER THE NUMBER OF THE DISPLAY TIME YOU WANT 1 <


Screen  3 

(4)  Next  you  will  be  asked  if  you  have  a  color  monitor. 


DO  YOU  HAVE  A  COLOR  MONITOR -Y/N  Y  < 


Screen  4 


(5)  If  you  have  a  color  monitor,  you  will  be  given  the  opportunity  to  change  colors. 


DO  YOU  WANT  TO  CHANGE  THE  COLORS- Y/N  N  < 


Screen  5 


(6)  The  next  prompt  warns  you  to  make  sure  that  the  3.5”  disk's  write-protect  tab  is 
disabled.  It  is  disabled  (will  allow  writing  to  it)  if  the  write-protect  tab  is  covering  the  small, 
rectangular  hole  on  your  disk.  If  you  can  see  through  the  hole,  slide  that  tab  over  the  hole. 
Additionally, you are asked to enter the drive that has the Super Quickie II program on it. Do
not enter a colon after the drive letter (do not enter A:, for example, just the letter A, B, or C).


Screen 6


(7)  The  following  notices  will  be  displayed  next. 


!! NOTE !!

UNITS OF MEASUREMENT MUST BE CONSISTENT
PRESS THE ESCAPE KEY AT ANY TIME TO EXIT PROGRAM


PRESS  THE  SPACE  BAR  TO  CONTINUE 


Screen  7 

(8) The next display gives you the options you have with Super Quickie II. Basically, you can choose HE or ICM, and you can choose to input the number of volleys and have Super Quickie II determine the fractional damage to the target area, or you can input the fractional damage desired and have Super Quickie II determine the number of volleys required.


OPTIONS

1 HE/FD  - DETERMINES THE EFFECTIVENESS OF HE WEAPONS WHERE THE NUMBER OF
           VOLLEYS/SALVOS IS INPUT AND THE EXPECTED FRACTIONAL DAMAGE OR
           CASUALTIES IS OUTPUT.

2 ICM/FD - DETERMINES THE EFFECTIVENESS OF ICMs WHERE THE NUMBER OF VOLLEYS
           OR SALVOS IS INPUT AND THE EXPECTED FRACTIONAL DAMAGE/CASUALTIES
           IS OUTPUT.

3 HE/NV  - DETERMINES THE EFFECTIVENESS OF HE WEAPONS WHERE THE DESIRED
           FRACTIONAL DAMAGE/CASUALTIES IS INPUT AND THE REQUIRED NUMBER OF
           VOLLEYS/SALVOS IS OUTPUT.

4 ICM/NV - DETERMINES THE EFFECTIVENESS OF ICMs WHERE THE DESIRED FRACTIONAL
           DAMAGE/CASUALTIES IS INPUT AND THE REQUIRED NUMBER OF VOLLEYS OR
           SALVOS IS OUTPUT.


ENTER THE NUMBER OF THE OPTION YOU WANT TO RUN 1

Screen 8


(9) The next set of displays request the inputs for the option you chose above. If you decide during the inputs that you have made an error on a previous entry, don't worry; you will get a chance to make corrections later. Just continue with the remainder of the entries. Another point worth remembering: some entries will produce an additional prompt at the bottom of your screen, and you could be frustrated if you don't notice it. The prompt may be waiting for a yes or no response while you are trying to enter a regular numerical input, which won't be accepted.

(10)  The  following  entries  pertain  to  option  #1,  selected  above  in  Screen  8.  This 
option  calls  for  a  number  of  volleys  of  high  explosive  (HE)  rounds,  and  will  obtain  a  result  in 
terms  of  the  fraction  of  the  target  area  destroyed.  Fractional  damage  of  a  target  area  is  a  decimal 
number  which  equates  to  the  fraction  of  the  total  number  of  personnel  or  materiel  targets  in  the 
target  area  which  were  destroyed  by  the  indirect  fire.  For  example,  if  the  lethal  areas  entered 
below  are  for  personnel,  a  result  of  .23  means  that  23%  of  the  personnel  in  the  target  area  were 
killed.  It  doesn't  matter  how  many  personnel  are  actually  in  the  target  area.  Similarly  for 
materiel  targets.  If  the  lethal  areas  entered  are  for  tanks,  then  a  result  of  .19  means  that  19%  of 
the  tanks  in  the  target  area  were  destroyed. 
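
Because fractional damage is a proportion, it can be applied to any assumed target population after the run. The short Python sketch below illustrates this; it is not part of Super Quickie II, the function name is an assumption, and the count of 200 personnel is a hypothetical figure chosen only for the example.

```python
def expected_losses(fractional_damage: float, targets_in_area: int) -> float:
    """Expected number of targets destroyed for a given fractional damage."""
    if not 0.0 <= fractional_damage <= 1.0:
        raise ValueError("fractional damage must be between 0 and 1")
    if targets_in_area < 0:
        raise ValueError("target count cannot be negative")
    return fractional_damage * targets_in_area


# A result of .23 applied to a hypothetical 200 personnel in the target area.
losses = expected_losses(0.23, 200)
print(f"Expected personnel casualties: {losses:.1f}")
```

The same fractional damage of .23 would be reported whether the target area held 20 or 2,000 personnel; only the expected count changes.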
(11) The first input is the number of rounds per volley. Press [Enter] after each input.


HE/FD

NUMBER OF ROUNDS PER VOLLEY/SALVO                         6
NUMBER OF UNIQUE VOLLEY/SALVO SETS (MAX = 5)              00000
ROUND RELIABILITY                                         00000
SUBMUNITION RELIABILITY                                   00000
VOLLEY/SALVO PATTERN LENGTH (RNG)                         00000
VOLLEY/SALVO PATTERN WIDTH (DEFL)                         00000
AREA TARGET LENGTH (RNG) OR RADIUS                        00000
AREA TARGET WIDTH (DEFL)                                  00000
NUMBER OF SUBMUNITIONS PER ROUND                          00000
ANGLE OF FALL, DEGREES                                    00000
SUBMUNITION RECTANGULAR PATTERN LENGTH (RNG) OR RADIUS    00000
SUBMUNITION RECTANGULAR PATTERN WIDTH (DEFL)              00000
MPI RANGE ERROR PROBABLE OR CEP                           00000
MPI DEFLECTION ERROR PROBABLE                             00000
PRECISION RANGE ERROR PROBABLE OR CEP                     00000
PRECISION DEFLECTION ERROR PROBABLE                       00000
TARGET LOCATION ERROR (CEP)                               00000
PATTERN ADJUSTMENT FACTOR (K)                             00000


Screen 9


(12) The second input is for the number of unique volley sets. For example, if you would like to obtain fractional damage results for firing 3 volleys and 12 volleys into the target area, then you have two unique volley sets: one set of 3 volleys and one set of 12 volleys. This model will automatically add an additional result for firing one volley. In this example we will enter two volley sets (3 volleys and 12 volleys), and the model will give us results for three volley sets: that is, results for 1 volley, 3 volleys, and 12 volleys. Note that there is a maximum of five unique volley sets. You can enter five volley sets, and the model will add the sixth result for one volley.
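
The volley-set rule described above can be illustrated with a short Python sketch. This is not code from Super Quickie II; the function name is an assumption, and the handling of a user-entered set of exactly one volley (treated here as a duplicate of the automatic one-volley result) is also an assumption, since the text does not say what the program does in that case.

```python
MAX_UNIQUE_SETS = 5  # maximum number of unique volley sets stated in the text


def reported_volley_sizes(unique_sets):
    """Return the volley sizes for which results would be reported:
    the user's sets plus the automatically added one-volley result."""
    if not 1 <= len(unique_sets) <= MAX_UNIQUE_SETS:
        raise ValueError(f"enter between 1 and {MAX_UNIQUE_SETS} volley sets")
    if any(size < 1 for size in unique_sets):
        raise ValueError("each volley set must contain at least one volley")
    # Assumption: a user-entered set of 1 is not reported twice.
    return sorted(set(unique_sets) | {1})


print(reported_volley_sizes([3, 12]))
```

For the example in the text, entering sets of 3 and 12 volleys yields results for 1, 3, and 12 volleys.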


(13) The second input (the number of unique volley sets) is one of those inputs which will produce an almost inconspicuous prompt at the bottom of the screen. This prompt will ask you to enter the number of volleys you want fired for each unique volley set. In our case, we're going to enter two unique volley sets, and the prompt at the bottom of the screen will appear, asking us to enter the number of volleys for volley set 1, and then another prompt will appear in the same place, asking us to enter the number of volleys for volley set 2.


(14) The second entry (number of unique volley sets) and the first prompt at the bottom of the screen look like Screen 10.


HE/FD

NUMBER OF ROUNDS PER VOLLEY/SALVO                         6
NUMBER OF UNIQUE VOLLEY/SALVO SETS (MAX = 5)              2
ROUND RELIABILITY                                         00000
SUBMUNITION RELIABILITY                                   00000
VOLLEY/SALVO PATTERN LENGTH (RNG)                         00000
VOLLEY/SALVO PATTERN WIDTH (DEFL)                         00000
AREA TARGET LENGTH (RNG) OR RADIUS                        00000
AREA TARGET WIDTH (DEFL)                                  00000
NUMBER OF SUBMUNITIONS PER ROUND                          00000
ANGLE OF FALL, DEGREES                                    00000
SUBMUNITION RECTANGULAR PATTERN LENGTH (RNG) OR RADIUS    00000
SUBMUNITION RECTANGULAR PATTERN WIDTH (DEFL)              00000
MPI RANGE ERROR PROBABLE OR CEP                           00000
MPI DEFLECTION ERROR PROBABLE                             00000
PRECISION RANGE ERROR PROBABLE OR CEP                     00000
PRECISION DEFLECTION ERROR PROBABLE                       00000
TARGET LOCATION ERROR (CEP)                               00000
PATTERN ADJUSTMENT FACTOR (K)                             00000


Screen 10


(15)  The  second  prompt  at  the  bottom  of  the  screen  will  ask  for  the  second  volley  size. 
In  our  example,  12  volleys  will  be  entered  for  the  size  of  the  second  volley  set,  as  follows. 


HE/FD

NUMBER OF ROUNDS PER VOLLEY/SALVO                         6
NUMBER OF UNIQUE VOLLEY/SALVO SETS (MAX = 5)              2
ROUND RELIABILITY                                         00000
SUBMUNITION RELIABILITY                                   00000
VOLLEY/SALVO PATTERN LENGTH (RNG)                         00000
VOLLEY/SALVO PATTERN WIDTH (DEFL)                         00000
AREA TARGET LENGTH (RNG) OR RADIUS                        00000
AREA TARGET WIDTH (DEFL)                                  00000
NUMBER OF SUBMUNITIONS PER ROUND                          00000
ANGLE OF FALL, DEGREES                                    00000
SUBMUNITION RECTANGULAR PATTERN LENGTH (RNG) OR RADIUS    00000
SUBMUNITION RECTANGULAR PATTERN WIDTH (DEFL)              00000
MPI RANGE ERROR PROBABLE OR CEP                           00000
MPI DEFLECTION ERROR PROBABLE                             00000
PRECISION RANGE ERROR PROBABLE OR CEP                     00000
PRECISION DEFLECTION ERROR PROBABLE                       00000
TARGET LOCATION ERROR (CEP)                               00000
PATTERN ADJUSTMENT FACTOR (K)                             00000

ENTER VOLLEY/SALVO SIZE NUMBER 2 = 12


Screen 11


(16) The next input is for round reliability. Notice at the bottom of the list the volley sizes are now displayed, including 1 volley added by the model.


NUMBER OF ROUNDS PER VOLLEY/SALVO                         6
NUMBER OF UNIQUE VOLLEY/SALVO SETS (MAX = 5)              2
ROUND RELIABILITY                                         0.96
SUBMUNITION RELIABILITY                                   00000
VOLLEY/SALVO PATTERN LENGTH (RNG)                         00000
VOLLEY/SALVO PATTERN WIDTH (DEFL)                         00000
AREA TARGET LENGTH (RNG) OR RADIUS                        00000
AREA TARGET WIDTH (DEFL)                                  00000
NUMBER OF SUBMUNITIONS PER ROUND                          00000
ANGLE OF FALL, DEGREES                                    00000
SUBMUNITION RECTANGULAR PATTERN LENGTH (RNG) OR RADIUS    00000
SUBMUNITION RECTANGULAR PATTERN WIDTH (DEFL)              00000
MPI RANGE ERROR PROBABLE OR CEP                           00000
MPI DEFLECTION ERROR PROBABLE                             00000
PRECISION RANGE ERROR PROBABLE OR CEP                     00000
PRECISION DEFLECTION ERROR PROBABLE                       00000
TARGET LOCATION ERROR (CEP)                               00000
PATTERN ADJUSTMENT FACTOR (K)                             00000

VOLLEY/SALVO SIZES 1,3,12


Screen 12


(17)  The  next  item  in  the  list,  submunition  reliability,  will  now  display  N/A  in  the 
right  column  because  we  selected  the  HE  option.  Submunition  reliability  is  used  for  ICM  only. 


NUMBER OF ROUNDS PER VOLLEY/SALVO                         6
NUMBER OF UNIQUE VOLLEY/SALVO SETS (MAX = 5)              2
ROUND RELIABILITY                                         0.96
SUBMUNITION RELIABILITY                                   N/A
VOLLEY/SALVO PATTERN LENGTH (RNG)                         00000
VOLLEY/SALVO PATTERN WIDTH (DEFL)                         00000
AREA TARGET LENGTH (RNG) OR RADIUS                        00000
AREA TARGET WIDTH (DEFL)                                  00000
NUMBER OF SUBMUNITIONS PER ROUND                          00000
ANGLE OF FALL, DEGREES                                    00000
SUBMUNITION RECTANGULAR PATTERN LENGTH (RNG) OR RADIUS    00000
SUBMUNITION RECTANGULAR PATTERN WIDTH (DEFL)              00000
MPI RANGE ERROR PROBABLE OR CEP                           00000
MPI DEFLECTION ERROR PROBABLE                             00000
PRECISION RANGE ERROR PROBABLE OR CEP                     00000
PRECISION DEFLECTION ERROR PROBABLE                       00000
TARGET LOCATION ERROR (CEP)                               00000
PATTERN ADJUSTMENT FACTOR (K)                             00000

VOLLEY/SALVO SIZES 1,3,12


Screen 13


(18) The volley/salvo pattern length (in the range direction) and width (in deflection) are entered next. These are the dimensions of the volley pattern in the impact area.


HE/FD

NUMBER OF ROUNDS PER VOLLEY/SALVO                         6
NUMBER OF UNIQUE VOLLEY/SALVO SETS (MAX = 5)              2
ROUND RELIABILITY                                         0.96
SUBMUNITION RELIABILITY                                   N/A
VOLLEY/SALVO PATTERN LENGTH (RNG)                         250
VOLLEY/SALVO PATTERN WIDTH (DEFL)                         93
AREA TARGET LENGTH (RNG) OR RADIUS                        00000
AREA TARGET WIDTH (DEFL)                                  00000
NUMBER OF SUBMUNITIONS PER ROUND                          00000
ANGLE OF FALL, DEGREES                                    00000
SUBMUNITION RECTANGULAR PATTERN LENGTH (RNG) OR RADIUS    00000
SUBMUNITION RECTANGULAR PATTERN WIDTH (DEFL)              00000
MPI RANGE ERROR PROBABLE OR CEP                           00000
MPI DEFLECTION ERROR PROBABLE                             00000
PRECISION RANGE ERROR PROBABLE OR CEP                     00000
PRECISION DEFLECTION ERROR PROBABLE                       00000
TARGET LOCATION ERROR (CEP)                               00000
PATTERN ADJUSTMENT FACTOR (K)                             00000

VOLLEY/SALVO SIZES 1,3,12


Screen 14


(19)  The  next  entry  is  for  the  area  target  length  (range  direction),  or  the  radius  of  the 
target  area.  A  prompt  at  the  bottom  of  the  screen  asks  you  if  you  entered  a  radius  or  not. 


HE/FD

NUMBER OF ROUNDS PER VOLLEY/SALVO                         6
NUMBER OF UNIQUE VOLLEY/SALVO SETS (MAX = 5)              2
ROUND RELIABILITY                                         0.96
SUBMUNITION RELIABILITY                                   N/A
VOLLEY/SALVO PATTERN LENGTH (RNG)                         250
VOLLEY/SALVO PATTERN WIDTH (DEFL)                         93
AREA TARGET LENGTH (RNG) OR RADIUS                        100
AREA TARGET WIDTH (DEFL)                                  00000
NUMBER OF SUBMUNITIONS PER ROUND                          00000
ANGLE OF FALL, DEGREES                                    00000
SUBMUNITION RECTANGULAR PATTERN LENGTH (RNG) OR RADIUS    00000
SUBMUNITION RECTANGULAR PATTERN WIDTH (DEFL)              00000
MPI RANGE ERROR PROBABLE OR CEP                           00000
MPI DEFLECTION ERROR PROBABLE                             00000
PRECISION RANGE ERROR PROBABLE OR CEP                     00000
PRECISION DEFLECTION ERROR PROBABLE                       00000
TARGET LOCATION ERROR (CEP)                               00000
PATTERN ADJUSTMENT FACTOR (K)                             00000

VOLLEY/SALVO SIZES 1,3,12

DID YOU ENTER A RADIUS - Y/N N


Screen 15
(20) The target width (deflection) is 100 meters for this example. As soon as you enter this number, the next line, for number of submunitions per round, will be shown as N/A.


NUMBER OF ROUNDS PER VOLLEY/SALVO                         6
NUMBER OF UNIQUE VOLLEY/SALVO SETS (MAX = 5)              2
ROUND RELIABILITY                                         0.96
SUBMUNITION RELIABILITY                                   N/A
VOLLEY/SALVO PATTERN LENGTH (RNG)                         250
VOLLEY/SALVO PATTERN WIDTH (DEFL)                         93
AREA TARGET LENGTH (RNG) OR RADIUS                        100
AREA TARGET WIDTH (DEFL)                                  100
NUMBER OF SUBMUNITIONS PER ROUND                          N/A
ANGLE OF FALL, DEGREES                                    00000
SUBMUNITION RECTANGULAR PATTERN LENGTH (RNG) OR RADIUS    00000
SUBMUNITION RECTANGULAR PATTERN WIDTH (DEFL)              00000
MPI RANGE ERROR PROBABLE OR CEP                           00000
MPI DEFLECTION ERROR PROBABLE                             00000
PRECISION RANGE ERROR PROBABLE OR CEP                     00000
PRECISION DEFLECTION ERROR PROBABLE                       00000
TARGET LOCATION ERROR (CEP)                               00000
PATTERN ADJUSTMENT FACTOR (K)                             00000

VOLLEY/SALVO SIZES 1,3,12


Screen 16

(21) The angle of fall is entered next, and the next two inputs will be shown as N/A.


NUMBER OF ROUNDS PER VOLLEY/SALVO                         6
NUMBER OF UNIQUE VOLLEY/SALVO SETS (MAX = 5)              2
ROUND RELIABILITY                                         0.96
SUBMUNITION RELIABILITY                                   N/A
VOLLEY/SALVO PATTERN LENGTH (RNG)                         250
VOLLEY/SALVO PATTERN WIDTH (DEFL)                         93
AREA TARGET LENGTH (RNG) OR RADIUS                        100
AREA TARGET WIDTH (DEFL)                                  100
NUMBER OF SUBMUNITIONS PER ROUND                          N/A
ANGLE OF FALL, DEGREES                                    47
SUBMUNITION RECTANGULAR PATTERN LENGTH (RNG) OR RADIUS    N/A
SUBMUNITION RECTANGULAR PATTERN WIDTH (DEFL)              N/A
MPI RANGE ERROR PROBABLE OR CEP                           00000
MPI DEFLECTION ERROR PROBABLE                             00000
PRECISION RANGE ERROR PROBABLE OR CEP                     00000
PRECISION DEFLECTION ERROR PROBABLE                       00000
TARGET LOCATION ERROR (CEP)                               00000
PATTERN ADJUSTMENT FACTOR (K)                             00000

VOLLEY/SALVO SIZES 1,3,12


Screen 17
(22) For the mean point of impact (MPI) errors, the first entry will cause a prompt at the bottom of the screen, asking you if you entered a circular error probable. In this case, yes.


NUMBER OF ROUNDS PER VOLLEY/SALVO                         6
NUMBER OF UNIQUE VOLLEY/SALVO SETS (MAX = 5)              2
ROUND RELIABILITY                                         0.96
SUBMUNITION RELIABILITY                                   N/A
VOLLEY/SALVO PATTERN LENGTH (RNG)                         250
VOLLEY/SALVO PATTERN WIDTH (DEFL)                         93
AREA TARGET LENGTH (RNG) OR RADIUS                        100
AREA TARGET WIDTH (DEFL)                                  100
NUMBER OF SUBMUNITIONS PER ROUND                          N/A
ANGLE OF FALL, DEGREES                                    47
SUBMUNITION RECTANGULAR PATTERN LENGTH (RNG) OR RADIUS    N/A
SUBMUNITION RECTANGULAR PATTERN WIDTH (DEFL)              N/A
MPI RANGE ERROR PROBABLE OR CEP                           40
MPI DEFLECTION ERROR PROBABLE                             00000
PRECISION RANGE ERROR PROBABLE OR CEP                     00000
PRECISION DEFLECTION ERROR PROBABLE                       00000
TARGET LOCATION ERROR (CEP)                               00000
PATTERN ADJUSTMENT FACTOR (K)                             00000

VOLLEY/SALVO SIZES 1,3,12

DID YOU ENTER A CEP - Y/N Y


Screen 18

(23)  Similarly  for  the  precision  errors. 


NUMBER OF ROUNDS PER VOLLEY/SALVO                         6
NUMBER OF UNIQUE VOLLEY/SALVO SETS (MAX = 5)              2
ROUND RELIABILITY                                         0.96
SUBMUNITION RELIABILITY                                   N/A
VOLLEY/SALVO PATTERN LENGTH (RNG)                         250
VOLLEY/SALVO PATTERN WIDTH (DEFL)                         93
AREA TARGET LENGTH (RNG) OR RADIUS                        100
AREA TARGET WIDTH (DEFL)                                  100
NUMBER OF SUBMUNITIONS PER ROUND                          N/A
ANGLE OF FALL, DEGREES                                    47
SUBMUNITION RECTANGULAR PATTERN LENGTH (RNG) OR RADIUS    N/A
SUBMUNITION RECTANGULAR PATTERN WIDTH (DEFL)              N/A
MPI RANGE ERROR PROBABLE OR CEP                           40
MPI DEFLECTION ERROR PROBABLE                             00000
PRECISION RANGE ERROR PROBABLE OR CEP                     42
PRECISION DEFLECTION ERROR PROBABLE                       00000
TARGET LOCATION ERROR (CEP)                               00000
PATTERN ADJUSTMENT FACTOR (K)                             00000

VOLLEY/SALVO SIZES 1,3,12

DID YOU ENTER A CEP - Y/N Y


Screen 19


(24)  The  target  location  error  is  entered  as  0  meters. 


NUMBER OF ROUNDS PER VOLLEY/SALVO                         6
NUMBER OF UNIQUE VOLLEY/SALVO SETS (MAX = 5)              2
ROUND RELIABILITY                                         0.96
SUBMUNITION RELIABILITY                                   N/A
VOLLEY/SALVO PATTERN LENGTH (RNG)                         250
VOLLEY/SALVO PATTERN WIDTH (DEFL)                         93
AREA TARGET LENGTH (RNG) OR RADIUS                        100
AREA TARGET WIDTH (DEFL)                                  100
NUMBER OF SUBMUNITIONS PER ROUND                          N/A
ANGLE OF FALL, DEGREES                                    47
SUBMUNITION RECTANGULAR PATTERN LENGTH (RNG) OR RADIUS    N/A
SUBMUNITION RECTANGULAR PATTERN WIDTH (DEFL)              N/A
MPI RANGE ERROR PROBABLE OR CEP                           40
MPI DEFLECTION ERROR PROBABLE                             00000
PRECISION RANGE ERROR PROBABLE OR CEP                     42
PRECISION DEFLECTION ERROR PROBABLE                       00000
TARGET LOCATION ERROR (CEP)                               0
PATTERN ADJUSTMENT FACTOR (K)                             00000

VOLLEY/SALVO SIZES 1,3,12


Screen 20

(25) The pattern adjustment factors may be obtained from the JTCG. In our example the pattern adjustment factor is 4.


NUMBER OF ROUNDS PER VOLLEY/SALVO                         6
NUMBER OF UNIQUE VOLLEY/SALVO SETS (MAX = 5)              2
ROUND RELIABILITY                                         0.96
SUBMUNITION RELIABILITY                                   N/A
VOLLEY/SALVO PATTERN LENGTH (RNG)                         250
VOLLEY/SALVO PATTERN WIDTH (DEFL)                         93
AREA TARGET LENGTH (RNG) OR RADIUS                        100
AREA TARGET WIDTH (DEFL)                                  100
NUMBER OF SUBMUNITIONS PER ROUND                          N/A
ANGLE OF FALL, DEGREES                                    47
SUBMUNITION RECTANGULAR PATTERN LENGTH (RNG) OR RADIUS    N/A
SUBMUNITION RECTANGULAR PATTERN WIDTH (DEFL)              N/A
MPI RANGE ERROR PROBABLE OR CEP                           40
MPI DEFLECTION ERROR PROBABLE                             00000
PRECISION RANGE ERROR PROBABLE OR CEP                     42
PRECISION DEFLECTION ERROR PROBABLE                       00000
TARGET LOCATION ERROR (CEP)                               0
PATTERN ADJUSTMENT FACTOR (K)                             00000

VOLLEY/SALVO SIZES 1,3,12


Screen 21
(26) Having filled the first set of prompts, the entries for lethal areas are next. After you enter the lethal areas, you will get a prompt at the bottom of the screen asking if you entered personnel lethal areas rather than lethal areas for materiel targets.


LETHAL AREAS

LETHAL AREAS MUST BE ENTERED IN DECREASING ORDER

LETHAL AREA OF POSTURE 1 OR MATERIEL TGT 1                530.0
LETHAL AREA OF POSTURE 2 OR MATERIEL TGT 2                400.0
LETHAL AREA OF POSTURE 3 OR MATERIEL TGT 3                130.0

DID YOU ENTER PERSONNEL LETHAL AREAS - Y/N Y


Screen 22


(27) The last set of entries pertains to the percent of personnel in each of the above postures during the first and subsequent volleys. For example, in the above screen you might have entered three lethal areas for personnel who are standing, crouching, and prone. The scenario might dictate that when the first volley lands, 80% of the personnel in the target area are standing, 10% are crouching, and 10% are in a prone position. However, for subsequent volleys, 10% of the personnel are still standing (running, moving to a new position, etc.), 50% are crouching, and 40% are in a prone position. Entries to match this scenario are as follows.


LETHAL AREAS

LETHAL AREAS MUST BE ENTERED IN DECREASING ORDER

LETHAL AREA OF POSTURE 1 OR MATERIEL TGT 1                530.0
LETHAL AREA OF POSTURE 2 OR MATERIEL TGT 2                400.0
LETHAL AREA OF POSTURE 3 OR MATERIEL TGT 3                130.0

FOR POSTURE SEQUENCING, THE FRACTION OF PERSONNEL IN EACH POSTURE MUST
BE BETWEEN 0.0 AND 1.0. THE SUM OF ALL THE POSTURES MUST EQUAL 1.0

FRACTION OF PERSONNEL POSTURE 1 DURING FIRST VOL/SAL      0.80
FRACTION OF PERSONNEL POSTURE 2 DURING FIRST VOL/SAL      0.10
FRACTION OF PERSONNEL POSTURE 3 DURING FIRST VOL/SAL      0.10

FRACTION OF PERSONNEL POSTURE 1 FOR SUBSEQUENT VOL/SAL
FRACTION OF PERSONNEL POSTURE 2 FOR SUBSEQUENT VOL/SAL
FRACTION OF PERSONNEL POSTURE 3 FOR SUBSEQUENT VOL/SAL


Screen  23 
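
The two entry rules stated on the lethal-area and posture screens (lethal areas in decreasing order; posture fractions between 0.0 and 1.0 that sum to 1.0) can be sketched as simple checks. This is only an illustration in Python, not the program's own code; the function names and the rounding tolerance are assumptions, and "decreasing" is taken to mean strictly decreasing.

```python
def lethal_areas_ok(areas):
    """True when the lethal areas are in strictly decreasing order (assumed meaning)."""
    return all(a > b for a, b in zip(areas, areas[1:]))


def posture_fractions_ok(fractions, tol=1e-6):
    """True when every fraction lies in [0, 1] and the fractions sum to 1.0."""
    in_range = all(0.0 <= f <= 1.0 for f in fractions)
    # tol is a hypothetical allowance for floating-point round-off
    return in_range and abs(sum(fractions) - 1.0) <= tol


# Example scenario from paragraph (27): 80/10/10 first volley, 10/50/40 afterwards.
first_volley = [0.80, 0.10, 0.10]
later_volleys = [0.10, 0.50, 0.40]
```

A check of this kind would accept both posture sets above and reject a set such as 0.10, 0.10, 0.10, which does not sum to 1.0.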


(28)  All  inputs  will  now  be  displayed,  and  you  are  given  the  opportunity  to  make 
changes  in  the  data  prior  to  calculating  the  results. 


INPUTS

 1  NUMBER OF RDS PER VOL/SAL           6
 2  NUM OF UNIQUE VOL/SAL SETS          2
 3  ROUND RELIABILITY                   .96
 4  SUBMUNITION RELIABILITY             N/A
 5  VOL/SAL PATTERN (RNG)               250.00
 6  VOL/SAL PATTERN (DEFL)              93.00
 7  TARGET LENGTH (RNG)                 100.00
 8  TARGET WIDTH (DEFL)                 100.00
 9  NUM OF SUBMUNITIONS PER RD          N/A
10  ANGLE OF FALL                       47.00
11  SUBMUNITION PATTERN (RNG)           N/A
12  SUBMUNITION PATTERN (DEFL)          N/A
13  MPI CEP                             40.00
14  MPI DEP                             N/A
15  PRECISION CEP                       42.00
16  PRECISION DEP                       N/A
17  TARGET LOCATION ERROR               0.00
18  PATTERN ADJ FACTOR (K)              4
19  POSTURE 1 LETHAL AREA               530.00
20  POSTURE 2 LETHAL AREA               400.00
21  POSTURE 3 LETHAL AREA               130.00
22  POSTURE 1 FIRST VOL/SAL             0.80
23  POSTURE 2 FIRST VOL/SAL             0.10
24  POSTURE 3 FIRST VOL/SAL             0.10
25  POSTURE 1 AFTER FIRST VOL/SAL       0.10
26  POSTURE 2 AFTER FIRST VOL/SAL       0.50
27  POSTURE 3 AFTER FIRST VOL/SAL       0.40
28  VOL/SAL SIZES                       1 3 12


DO YOU WANT TO MAKE A CHANGE - Y/N    N


Screen  24 


(29) If you had wanted to make a change in the inputs, you would have responded
with a "Y" to the prompt at the bottom of Screen 24. However, assuming that no changes need
to be made, a response of "N", as in Screen 24, will produce the desired results in terms of
fractional damage.


(30)  Whether  you  press  "P"  to  print  the  results  shown  on  the  screen,  or  press  the 
spacebar  to  continue,  the  following  screen  is  displayed. 


(31)  Entering  a  3,  above,  will  return  you  to  the  a:>  prompt  after  the  next  screen. 


UNCLASSIFIED 


RECOMMENDED  CHANGES,  COMMENTS  OR  CORRECTIONS  TO 
IMPROVE  THIS  PROGRAM  SHOULD  BE  ADDRESSED  TO: 


DIRECTOR 

U.S.  ARMY  MATERIEL  SYSTEMS  ANALYSIS  ACTIVITY 
ATTN:  AMXSY-J 

ABERDEEN PROVING GROUND, MD 21005-5071


PRESS  THE  SPACE  BAR  TO  RETURN  TO  SYSTEM 


UNCLASSIFIED 


Screen  27 


(32)  Upon  pressing  the  space  bar,  you  will  be  returned  to  the  A:\SQ>  prompt.  If  you 
need  to  return  to  the  C:  drive  and  prompt,  simply  type  C:  and  press  the  [Enter]  key. 
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An  Application  of  Generalized  p-values 
in  Tank  Gun  Accuracy  Research 


David  W.  Webb 

US  Army  Research  Laboratory 
Weapons  Technology  Directorate 
Aberdeen  Proving  Ground,  MD  21005 


By optimally rotating a tank cannon to counteract gravity
droop and the cannon's dynamic response during firing, the
idea of "dynamic indexing" was believed to be a major step in
the reduction of between-tube variability, $\sigma^2_T$. Using an
indirect approach to compare the between-tube variance
components for dynamically indexed tubes (DIT's) and standard
tubes (ST's), an earlier analysis of the test data failed to
show a difference between $\sigma^2_{T-DIT}$ and $\sigma^2_{T-ST}$. Seeking a more
direct comparison of independently obtained between-tube
variance components, Zhou and Mathew proposed a test variable
based on the recently developed concept of generalized
p-values. This paper describes how this generalized test
variable is employed to compare two between-tube variance
components taken from independent mixed models. Finally, a
comparison is made between the conclusions drawn from the
original analysis and a reanalysis of the field test using
Zhou and Mathew's generalized p-value approach.


Introduction 

U.S. Army experiments conducted in the late 1980s showed that
between-tube variability is a significant contributor to the
overall error in the M1A1 series tank. In an attempt to reduce
this variability, researchers took advantage of the fact that each
gun tube has its own unique curvature by proposing that gun tubes
be dynamically indexed (Schmidt, et al., 1988). That is, each gun
tube is rotated about the center boreline so that its curvature
counteracts both the gravitational droop and the whipping motion of
the tube immediately after trigger pull. This whipping motion
(more properly referred to as the dynamic response) is caused by a
vertical difference in the centers-of-gravity of the gun tube and
the breech block which supports the gun tube. Normally, a gun tube
is only rotated to lessen the effects of the gravitational droop.
This is known as standard indexing.

In  1990,  the  U.S.  Army  Ballistic  Research  Laboratory  (now  part 
of  the  U.S.  Army  Research  Laboratory)  conducted  a  large-scale  field 
test  whose  primary  purpose  was  to  determine  if  dynamic  indexing 
would  reduce  the  between-tube  variability  of  the  M1A1  series  tank 
(Webb, et al., 1991). This costly experiment included four types
of  ammunition,  four  tanks,  twenty  standard  tubes  (ST's),  and  twenty 
dynamically  indexed  tubes  (DIT's).  The  response  recorded  from  each 
round  was  its  horizontal  and  vertical  jump,  where  jump  is  defined 
as  the  distance  from  the  aimpoint  to  the  impact  point  after  all 
known  corrections  (such  as  wind  and  muzzle  velocity)  have  been 
applied. 

A  separate  and  independent  analysis  of  jump  was  performed  for 
all  eight  (2  x  4)  combinations  of  direction  and  ammunition  type. 
Table  1  shows  an  arrangement  of  the  data  collected  for  each  subset 
of  the  entire  test.  In  this  table,  we  see  that  the  fixed  factor 
Tube  Type  and  the  random  factor  Tank  were  crossed,  while  the  random 
factor  Tube  was  nested  within  Tube  Type.  Three  rounds  were  fired 
per  cell. 

To  obtain  an  estimate  of  between-tube  variability  for  both 
dynamically  indexed  and  standard  tubes,  two  independent  mixed 
linear  models  were  applied  to  Table  1  (one  for  each  tube  type) . 
For  each  type  of  tube,  the  linear  model  is: 

$$z_{ijk} = \mu + \alpha_i + \beta_{j(i)} + \epsilon_{k(ij)},$$

where

1) $z_{ijk}$ is the jump of the kth round from the jth tube on the
ith tank, measured in mils;

2) $\mu$ is the overall mean;

3) $\alpha_i$ is the effect of the ith tank for i = 1, 2, 3, 4;

4) $\beta_{j(i)}$ is the effect of the jth tube on the ith tank for j =
1, 2, 3, 4, 5;

5) $\beta_{j(i)} \sim N(0, \sigma^2_T)$;

6) $\epsilon_{k(ij)}$ is the error associated with the kth round from the
jth tube on the ith tank for k = 1, 2, 3; and

7) $\epsilon_{k(ij)} \sim N(0, \sigma^2_R)$.

Table 1. Data matrix for each combination of direction and
ammunition type. Each "z" represents a jump value.


Comparison of Between-Tube Variabilities: An Indirect Approach


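The goal of the statistical analysis was to conduct a one-sided
hypothesis test comparing the between-tube variabilities,
namely,

$$H_0: \sigma^2_{T-ST} \le \sigma^2_{T-DIT} \quad \text{vs.} \quad H_1: \sigma^2_{T-ST} > \sigma^2_{T-DIT},$$

or, equivalently,

$$H_0: \frac{\sigma^2_{T-ST}}{\sigma^2_{T-DIT}} \le 1 \quad \text{vs.} \quad H_1: \frac{\sigma^2_{T-ST}}{\sigma^2_{T-DIT}} > 1.$$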


Superficially, the ratio

$$F^* = \frac{MS_{T-ST}}{MS_{T-DIT}} = \frac{SS_{T-ST}}{SS_{T-DIT}},$$

where $F^*$ follows an F distribution with 16 numerator and 16
denominator degrees of freedom, may appear to be a proper test
statistic for $H_0$. However, examination of the expected mean
squares for each model shows that $F^*$ is actually a test statistic
for

$$H_0: \frac{\sigma^2_{R-ST} + 3\sigma^2_{T-ST}}{\sigma^2_{R-DIT} + 3\sigma^2_{T-DIT}} \le 1 \quad \text{vs.} \quad H_1: \frac{\sigma^2_{R-ST} + 3\sigma^2_{T-ST}}{\sigma^2_{R-DIT} + 3\sigma^2_{T-DIT}} > 1.$$


Under the assumption that $\sigma^2_{R-DIT}$ and $\sigma^2_{R-ST}$ are equal, $F^*$ serves
as an indirect test statistic for $H_0$, since significantly large
values of $F^*$ would be attributable to differences in the between-tube
variabilities and not the between-round variabilities.


The assumption of equal between-round variabilities can be
tested by the statistic

$$F^\dagger = \frac{MS_{R-ST}}{MS_{R-DIT}} = \frac{SS_{R-ST}}{SS_{R-DIT}},$$

where $F^\dagger$ follows an F distribution with 40 degrees of freedom in
both the numerator and denominator. If this assumption is not
rejected, then one may proceed with the computation of $F^*$ to test
$H_0$.


On the other hand, if the assumption of equal between-round
variance is rejected, then $F^*$ is more prone to Type I or Type II
errors. These errors are due to the presence of the nuisance
parameters, $\sigma^2_{R-DIT}$ and $\sigma^2_{R-ST}$, in the expected value of the test
statistic. How should the analyst proceed if this is the case?


Comparison of Between-Tube Variabilities: The Generalized p-value
Approach

As described by Tsui and Weerahandi (1989), classical one-sided
hypothesis tests of the form $H_0: \theta \le \theta_0$ versus $H_1: \theta > \theta_0$
utilize a test statistic $T(X)$ that is simply a function of the
sample space, $X$. For the observed response, $x$, the critical
region, $C_x$, is defined as

$$C_x = \{X : T(X) \ge T(x)\}.$$

The p-value associated with the hypothesis test is then given as

$$p = \Pr(X \in C_x \mid \theta = \theta_0).$$

However, if this probability is dependent upon some nuisance
parameter, $\eta$, then the p-value may not be calculable. This is
exactly the problem that exists with the dynamically indexed tube
experiment.

Tsui and Weerahandi proposed the idea of a generalized p-value
(GPV) for one-sided hypothesis tests when nuisance parameters are
present. In lieu of a test statistic, a generalized test variable
is used, which is not only a function of the sample space, but also
the sample data and the parameters. The generalized test variable,
$T(X; x, \theta, \eta)$, is chosen so that for all fixed values of $x$, the
following conditions hold:

1) $t = T(x; x, \theta_0, \eta)$ is free of $\eta$;

2) the distribution of $T(X; x, \theta_0, \eta)$ is free of $\eta$; and

3) for fixed $\eta$, $\Pr(T(X; x, \theta, \eta) \ge t)$ is nondecreasing in $\theta$.

In addition, the critical region is replaced by the
generalized extreme region, $C_x(\theta, \eta)$, whose domain includes the
nuisance parameter, and is defined to be:

$$C_x(\theta, \eta) = \{X : T(X; x, \theta, \eta) \ge T(x; x, \theta, \eta)\}.$$

Finally, the GPV is given as:

$$p = \Pr\{X \in C_x(\theta, \eta) \mid \theta = \theta_0\} = \Pr(T(X; x, \theta_0, \eta) \ge t).$$

With  the  above  definitions,  Tsui  and  Weerahandi  showed  that 
the  GPV  is  independent  of  the  nuisance  parameters  and  can  hence  be 
used  as  evidence  against  the  null  hypothesis. 

The construction of generalized test variables is not a
trivial task and unfortunately little guidance is given in the few
papers that have been published on this topic. Zhou and Mathew
(1993) derived a generalized test variable that is used to compare
variance components obtained from two independent mixed hierarchical
models. This methodology was directly applied to the between-tube
variability comparison. Their generalized test variable is given
by:

$$T(X; x, \theta, \eta) = T(X; x, \sigma^2_{T-DIT}, \sigma^2_{T-ST}, \sigma^2_{R-DIT}, \sigma^2_{R-ST}),$$

where each SS term is the random variable for the appropriate
sums-of-squares and each ss term is the realized value of SS. These ss
values are taken directly from standard analyses of variance of
the field test data.


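Although this generalized test variable appears to be very
cumbersome, the calculation of a GPV is straightforward. Under $H_0$,

$$T(X; x, \theta_0, \eta) = \frac{(\sigma^2_{R-DIT} + 3\sigma^2_{T-DIT})\,\dfrac{ss_{T-DIT}}{SS_{T-DIT}} + \sigma^2_{R-ST}\,\dfrac{ss_{R-ST}}{SS_{R-ST}}}{(\sigma^2_{R-ST} + 3\sigma^2_{T-ST})\,\dfrac{ss_{T-ST}}{SS_{T-ST}} + \sigma^2_{R-DIT}\,\dfrac{ss_{R-DIT}}{SS_{R-DIT}}} = \frac{k_1/\chi^2_1 + k_2/\chi^2_2}{k_3/\chi^2_3 + k_4/\chi^2_4},$$

where each $k_i$ is an observed sum-of-squares (a constant) and each
$\chi^2_i$ is a chi-square random variable.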

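Furthermore, if $X = x$, then $SS_{T-DIT} = ss_{T-DIT}$, $SS_{T-ST} = ss_{T-ST}$,
$SS_{R-DIT} = ss_{R-DIT}$, and $SS_{R-ST} = ss_{R-ST}$, so that

$$t = T(x; x, \theta_0, \eta) = \frac{(\sigma^2_{R-DIT} + 3\sigma^2_{T-DIT})(1) + \sigma^2_{R-ST}(1)}{(\sigma^2_{R-ST} + 3\sigma^2_{T-ST})(1) + \sigma^2_{R-DIT}(1)} = \frac{\sigma^2_{R-DIT} + 3\sigma^2_{T-DIT} + \sigma^2_{R-ST}}{\sigma^2_{R-ST} + 3\sigma^2_{T-ST} + \sigma^2_{R-DIT}} = 1$$

(since $\sigma^2_{T-DIT} = \sigma^2_{T-ST}$).
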


Finally, the expression for the GPV simplifies to

$$p = \Pr(T(X; x, \theta_0, \eta) \ge 1);$$

that is, the GPV is the relative frequency with which this function
of chi-square random variables equals or exceeds unity.

For  known  values  of  the  sums-of-squares,  this  probability  can 
be  determined  by  simulation.  A  FORTRAN  program  simulated  50,000 
values  of  the  generalized  test  variable  and  counted  the  number  of 
times  that  it  exceeded  unity  to  obtain  the  GPV. 
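
The simulation step can be sketched in a few lines. The following Python sketch is not the FORTRAN program used in the study. It assumes the test variable reduces, under the null hypothesis, to (ss_T-DIT/U1 + ss_R-ST/U2) / (ss_T-ST/U3 + ss_R-DIT/U4), where U1 and U3 are independent chi-square variables with 16 degrees of freedom and U2 and U4 are independent chi-square variables with 40 degrees of freedom. The function name, argument names, and seed are hypothetical, and since the actual sums of squares are not reported, any values supplied are illustrative only.

```python
import random


def simulate_gpv(ss_t_dit, ss_r_dit, ss_t_st, ss_r_st,
                 df_tube=16, df_round=40, n_draws=50_000, seed=1):
    """Estimate the GPV as the fraction of simulated values that are >= 1."""
    rng = random.Random(seed)

    def chi_square(df):
        # A chi-square variable with df degrees of freedom is Gamma(df/2, scale=2).
        return rng.gammavariate(df / 2.0, 2.0)

    exceed = 0
    for _ in range(n_draws):
        numerator = ss_t_dit / chi_square(df_tube) + ss_r_st / chi_square(df_round)
        denominator = ss_t_st / chi_square(df_tube) + ss_r_dit / chi_square(df_round)
        if numerator / denominator >= 1.0:
            exceed += 1
    return exceed / n_draws
```

Under this assumed form, a between-tube sum of squares for standard tubes that is much larger than that for dynamically indexed tubes drives the estimate toward zero, while identical inputs for the two tube types give an estimate near 0.5.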


Results 

Due  to  security  classification  restrictions,  the  sums  of 
squares  derived  from  the  data  cannot  be  divulged  in  this  report. 
However,  p-values  from  both  the  indirect  and  GPV  approach 
hypothesis  tests  for  all  combinations  of  direction  and  ammunition 
type  are  presented  in  Table  2. 


                           Indirect         Generalized
Direction    Ammunition    Test Approach    Test Approach

Azimuth      A             .993 (.433)      .989
             B             .858 (.111)      .676
             C             .017 (.016)      .036
             D             .149 (.373)      .122

Elevation    A             .981 (.760)      .972
             B             .779 (.745)      .759
             C             .253 (.560)      .226
             D             .433 (.873)      .453

(p-values for the test of $H_0: \sigma^2_{R-ST} = \sigma^2_{R-DIT}$ are in parentheses)


Table 2. P-values for the tests of $H_0: \sigma^2_{T-ST} \le \sigma^2_{T-DIT}$.

The two columns of p-values for $H_0: \sigma^2_{T-ST} \le \sigma^2_{T-DIT}$ are quite
similar. This indicates that both analysis approaches arrive at
the same basic conclusion, namely that dynamic indexing fails to
consistently reduce between-tube variability. Only in one case out
of eight (Ammunition C in azimuth) was $\sigma^2_{T-DIT}$ determined to be
significantly lower than $\sigma^2_{T-ST}$.

It is also interesting to note a few differences in the two
columns of p-values for $H_0: \sigma^2_{T-ST} \le \sigma^2_{T-DIT}$. For Ammunition C in
azimuth, the GPV is more than double the p-value obtained
via the indirect approach. Also, for Ammunition B in azimuth, the
absolute difference in p-values is rather large. These differences
may be due to the unequal between-round variabilities associated
with these data sets (note the low parenthesized p-values in Table
2). Recall that the indirect approach requires that the between-round
variabilities be equal, whereas the GPV approach does not
require this assumption and is therefore an exact hypothesis
testing procedure. Violation of this assumption may result in
unreliable p-values reported via the indirect approach.


Summary 

For this particular data set, both procedures arrived at the
same conclusions, to the dismay of the engineers behind the dynamic
indexing concept. Some minor differences in the p-values
highlighted potential problems in using the indirect approach to
test $H_0: \sigma^2_{T-ST} \le \sigma^2_{T-DIT}$.

The procedures for testing independent between-tube
variabilities presented in this paper each have their particular
advantages and drawbacks. The indirect approach is simple to
apply, as it requires only the use of F ratios based on sums-of-squares
taken directly from independent analyses of variance.
However, this approach relies on assumptions made about the
nuisance parameters, $\sigma^2_{R-ST}$ and $\sigma^2_{R-DIT}$. Failure to meet the
assumptions may increase either the Type I or Type II error
probabilities.

The GPV approach is independent of the nuisance parameters,
and is therefore an exact test for the null hypothesis. The main
disadvantage to this approach is that there is little guidance in
the statistical literature on the derivation of a proper
generalized test variable. Furthermore, computation of the GPV
requires computer simulation of the generalized test variable. If
the analyst can obtain a proper generalized test variable, the
exactness of the GPV approach makes it the more desirable of the
two analytical strategies.
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ABSTRACT 

In the ideal communications network each node would be smart enough to monitor
network performance and, when necessary, adapt itself to better accommodate its workload.
The adaptive network node would employ a decision algorithm to modify configuration,
routing and protocol parameters based on measured network performance and system
requirements. This paper describes continuing research into feasible approaches to
developing an adaptive network for use in battlefield command and control systems. The
initial approach entails the collection of message traffic information into a deductive
database from which network performance is assessed and compared to system requirements.
Inadequate performance would trigger identification and assessment of alternatives
for improvement. The project emphasizes use of actual hardware and controlled
experiments to explore alternatives for parameter settings. This paper describes an initial
attempt to identify baseline performance data for a prototype communications network
and to determine those factors to which the system is most sensitive.

BACKGROUND 

Decentralized battlefield command and control requires reliable and timely distribution
of information. At present, information distribution is limited by noisy channels and
protocols that do not meet traffic demands, forcing commanders to make decisions from
out-of-date or incomplete information. To solve this problem, our research addresses
control of noise and interference on communication channels and construction of network
protocols that will be effective on the modern battlefield.

Currently the civilian sector is experiencing a communications revolution; however,
civilian applications often assume a physical infrastructure, such as towers and high-power
base stations, that is not always feasible in a military environment. Our research takes
into account the special problems of the battlefield: mobility, bandlimited channels,
arbitrary or intentional interference, multimedia data, and rapid pace of operations. The
networks that are of particular interest to the Army have nodes with high computing power
but weak, noisy, shared communication links. For this reason, our approach to
communication emphasizes working intelligently at each node to limit or redirect the amount
of information that must be passed along the communication channel. Each node is
assumed to act independently to improve the effectiveness of the information exchange
between nodes. Such a system of controls requires that each node be able to: monitor the
network traffic; decide whether performance is inadequate; and if so, make an appropriate
adjustment to the protocol.

Figure 1. An Adaptive Tactical Network.


OBJECTIVE 

Protocol parameters such as packet size, coding technique and channel access algorithm
could be adjusted to improve or possibly optimize information transfer. In general,
the objectives are to maximize throughput and minimize delay in the delivery of information
to the end user, where throughput and delay are defined as follows:

Network throughput is the average number of bits per second that are successfully
transmitted and acknowledged over a one-hour test cell. This does not include such
overhead as acknowledgements, error detection/correction codes, synchronization
characters, or, in the event of collisions, message retransmissions.

Network delay is the average time interval that passes between a message's arrival at
the host's modem and the host's receipt of the message acknowledgement. Messages
that are not completely serviced during the running of the test cell will not be considered
in computing network delay.
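
The two definitions above can be illustrated with a short Python sketch that computes throughput and delay from a message log for one test cell. The record layout, field names, and function names are assumptions made for illustration; they do not describe the logging software used in the experiment. Payload bits are taken to exclude the overhead listed above, and unacknowledged messages are excluded from both measures.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MessageRecord:
    payload_bits: int            # message bits, excluding protocol overhead
    arrival_s: float             # time the message arrived at the host's modem
    ack_s: Optional[float]       # time the acknowledgement was received, if any


def _serviced(records: List[MessageRecord], cell_seconds: float) -> List[MessageRecord]:
    """Messages fully serviced (acknowledged) within the test cell."""
    return [r for r in records if r.ack_s is not None and r.ack_s <= cell_seconds]


def throughput_bps(records: List[MessageRecord], cell_seconds: float = 3600.0) -> float:
    """Average acknowledged payload bits per second over the test cell."""
    return sum(r.payload_bits for r in _serviced(records, cell_seconds)) / cell_seconds


def mean_delay_s(records: List[MessageRecord], cell_seconds: float = 3600.0) -> Optional[float]:
    """Average arrival-to-acknowledgement interval; None if nothing was serviced."""
    done = _serviced(records, cell_seconds)
    if not done:
        return None
    return sum(r.ack_s - r.arrival_s for r in done) / len(done)
```

For example, a log with two acknowledged 800-bit messages and one message still outstanding at the end of the hour counts 1600 bits toward throughput and averages delay over the two serviced messages only.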

The question of how best to adapt to a particular situation is extremely difficult to
address. Research into network protocols and communication channels will provide the
underlying foundation required to identify appropriate network adaptations. However,
because of the complexity of these protocols, theoretical research must be supported with
carefully designed and controlled experiments to determine which network parameters
are most useful in moderating network congestion.
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APPROACH 


Based on previous research, several parameters were selected for a sensitivity analysis:
retry interval, the time to wait for an acknowledgement before retransmitting;
window size, the number of outstanding messages permitted before transmission is blocked;
message length, the number of characters in each message; and arrival rate, the number
of messages per hour queued for transmission at each node.

A  prepilot  test  has  been  conducted  to  determine  thresholds  for  retry  interval,  window 
size,  message  length  and  arrival  rate.  Next  a  pilot  test  will  be  executed  to  screen  each  of 
the  four  parameters  for  possible  elimination.  Finally  an  experiment  will  be  designed  and 
executed  to  measure  throughput  and  delay  under  each  of  the  test  cell  conditions. 


EXPERIMENTAL  CONFIGURATION 

The experimental hardware consists of the equipment shown in Figure 2. The computers
are Tadpole SPARCbook I's, each with 32 megabytes of memory. These are connected
to a Harris Black Box Radio Emulator via Harris Tactical Data Buffers (TDB). The
TDB provides an interface between VHF transceivers and digital computer equipment.


Figure  2.  Experimental  Configuration 

In providing this service the TDB performs the following tasks: data modulation/demodulation,
error detection/correction, and compensation for unequal terminal and radio
link data rates.

The SPARCbooks are connected to a SUN 4/280 serving as the data storage and data
reduction machine. The software residing on the SUN generates messages, logs message
traffic, and identifies message retries and delay. To minimize blocking and possible errors,
input is read from text files in a predefined order. Through the software, the experimenter
can interactively select which test cell and iteration to execute next.


PREPILOT  RESULTS 

The prepilot test was conducted to determine thresholds for retry interval, window
size, message length and arrival rate and to explore limitations of the software. Figure
3 illustrates the various factors explored.

During this period it was found that a window size of one resulted in overflow errors that
prevented data transmission. Average message delays were computed over one-minute
intervals to ensure the sampling was sufficient to identify the warm-up period. Software
requires further development to support fully automated execution of an entire replication
and to accommodate more nodes in the network.


FACTOR                            LEVELS

Retry Timeout (seconds)           10     40

Window Size (messages)            8      60

Message Length (characters)       80     240

Arrival Rate (messages/node)      200    600

Figure 3. Prepilot Study Factors and Levels


FUTURE  WORK 

When software modifications are completed, the pilot test will be conducted to explore
the need to eliminate or refine the levels of investigation for any of the factors. The
number of replications will be dependent upon the duration of the test cells and the
amount of automation introduced. A full factorial design will be implemented. The
parameters selected for this test are those which can be easily modified. Future experiments
will consider more complex protocol modifications.
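
As an illustration of what a full factorial design over the four factors involves, the sketch below enumerates every combination of two levels per factor, giving 2^4 = 16 test cells. It uses the prepilot levels from Figure 3 purely as placeholders, since the final levels are to be settled by the pilot test; the dictionary keys and function name are hypothetical.

```python
from itertools import product

# Prepilot levels from Figure 3, used here only as illustrative placeholders.
FACTORS = {
    "retry_timeout_s": (10, 40),
    "window_size_msgs": (8, 60),
    "message_length_chars": (80, 240),
    "arrival_rate_msgs_per_node": (200, 600),
}


def full_factorial(factors):
    """Return one dict per test cell, covering every combination of levels."""
    names = list(factors)
    return [dict(zip(names, combo))
            for combo in product(*(factors[name] for name in names))]


cells = full_factorial(FACTORS)
```

Each replication of the design would then run all 16 cells, so the total number of one-hour test cells grows in direct proportion to the number of replications.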
ABSTRACT. The Army Mobility Model (AMM), developed at the
U.S. Army Engineer Waterways Experiment Station, uses the data from
about a hundred factors that describe a vehicle terrain unit, road
unit, or linear feature to predict vehicular speeds. Recently,
Monte Carlo simulations were conducted for several wheeled and
tracked vehicles and different areas, varying some selected groups
of these factors plus and minus 10 percent about their nominal
values. The results of these simulations have been studied to
develop empirical relationships that allow the expression of
confidence measures for the speed predictions on an entire mobility
map. As a first step, programs have been written to test methods to
estimate the value of continuous statistical parameters (the mode
and its standard deviation) of a discrete histogram. This allows
theorems of mathematical statistics to be applied to the confidence
levels around the values of the parameters. The method uses a
variation I made on E. Parzen's formula for the location of the
mode of the continuous distribution associated with a discrete
histogram.¹ The formula works by estimating the rate of an
associated statistical process by discrete windows (Jth waiting
times). The incomplete gamma function and a maximum likelihood
product are then used to estimate the parameters.² This approach
has been tested for a range of Monte Carlo generated discrete
approximations to gamma distributions. It was then applied to the
histograms of possible errors in speed predictions of tactical
vehicles moving across areas on different mapsheets. These
histograms were generated previously in the course of the work by
Lessem and Ahlvin and are discussed in reference [6].


¹ See Parzen, Emanuel, "Stochastic Processes", Holden Day, 1962,
and Press, W., Flannery, B. et al., "Numerical Recipes in C," 2nd
ed., Cambridge U. Press, 1988.

² Ibid.


In trying to determine how to organize the sensitivity trials
in this particular set of programs and data, there are several
approaches that can be taken. Because the speed prediction program
uses a series of lookup tables and flow-chart "yes" or "no", go
and no-go cutoff rules, points at which the program computes a
no-go output are natural areas to investigate its sensitivity to
errors in the data. Error measures can be associated with "critical
regions" in the data around these points. Determining the modes and
moments in the discrete non-parametric histograms generated by the
sensitivity trials gives a way of characterizing and reproducing
the confidence in information contained in the program's output
involving these regions. One approach, which measures the program's
"inherent sensitivity" to errors, is terrain-independent and
vehicle-dependent. It examines the code in the program to find the 1-factor
critical regions in the outputs of the Monte Carlo trials. It then
adjusts the values of the other factors in a detrimental direction
of the lookup table values until the 2, 3 and higher multi-factor
critical regions are identified. Another approach is "project
specific" and is both terrain-dependent and vehicle-dependent. It
looks at the areas on the speed prediction maps where no-gos occur.
It then goes back to the input files to determine the values of the
data at the terrain units where these no-gos occur. This is the
approach that will be taken in this paper.

After the procedure for conducting the trials is determined, it
is important to consider ways to examine confidence levels for the
parameters that are estimated. One approach to this, which recently
has gained popularity, is the technique of bootstrapping. This
technique conducts Monte Carlo trials of the Monte Carlo trials.
The algorithm resamples not from the original data, but from a
smoothed kernel estimate of the data (see MathCad [8] for the
details of the algorithm and Efron, Hall and Tittleman, and Scott
for the theory behind formulas for the variance of the sampled
estimate of the parameter). Smoothed kernel formulas, introduced by
Parzen and others (see Scott [12], Parzen [9]), allow better
resolution of modes and other information in the data using a given
histogram bin size or window. In order to estimate the second
moment or the variance of the kernel estimate, it is necessary to
write programs to compute the second derivative of the frequency
polygon of the histogram (see Scott [12]). Bootstrap confidence
intervals can then also be computed from this information.

In this paper a somewhat simplified approach is taken. A
leave-one-out maximum likelihood product of smoothed kernels is
computed over a choice of possible bin widths. Once the best bin
width is determined, the variance of the kernel associated with this
bin width is computed (see Numerical Recipes in C, 2nd ed. [10]). This
aggregates the data in a one-dimensional histogram and does not
give as much information as the more complicated multi-dimensional
approach.
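
A minimal Python sketch of leave-one-out likelihood selection of a bin width is given below. It is not the program used in this work: it assumes a Gaussian kernel (the paper does not state the kernel form), scores each candidate width by the sum of log leave-one-out densities (equivalent to maximizing the likelihood product), and takes the kernel variance for the chosen width h to be h squared, which holds for a Gaussian kernel. All function names are hypothetical.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def loo_log_likelihood(data, h):
    """Sum over points of the log density estimated from all the other points."""
    n = len(data)
    total = 0.0
    for i, xi in enumerate(data):
        density = sum(gaussian_kernel((xi - xj) / h)
                      for j, xj in enumerate(data) if j != i)
        density /= (n - 1) * h
        if density <= 0.0:
            return float("-inf")   # this width explains the held-out point not at all
        total += math.log(density)
    return total


def best_bandwidth(data, candidates):
    """Candidate width with the largest leave-one-out log likelihood."""
    return max(candidates, key=lambda h: loo_log_likelihood(data, h))


def kernel_variance(h):
    """Variance of a Gaussian kernel with width h."""
    return h * h
```

A width that is too small leaves each held-out point far from its neighbors in kernel units, and a width that is too large flattens the estimate, so the leave-one-out score favors an intermediate value.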

[Figure 1]

[Figure 2. Monte Carlo Sensitivity Runs, M113]

[Figure 4. Monte Carlo Sensitivity Trials, M-1]

Figures 1, 2, 3, and 4 show the results of a series of Monte
Carlo error sensitivity trials run on some vehicle speed
predictions by Lessem et al. [6]. They display the speeds predicted
for the M998 High Mobility Multi-Purpose Vehicle (HMMV), the M977
10-Ton Heavy Expanded Mobility Tactical Truck (HEMTT), the M113
Armored Personnel Carrier, and the M-1 tank. The terrain areas
tested are in Yakima, Washington; Granjean Wells, New Mexico; and
Batchelor, Australia. The graphs have predicted speeds plotted on
the  horizontal  axis.  The  speeds  were  computed  by  varying  nine 
factors:  soil  strength,  slope,  surface  roughness,  visibility, 
vegetation  type,  and  four  other  attributes  dealing  with  obstacle 
characteristics  around  their  nominal  values  in  a  certain  terrain 
unit.  The  nominal  values  for  that  terrain  unit  were  chosen  as  the 
points  around  which  the  vehicle's  performance  on  the  mapsheet 
terrain  units  changed  most  noticeably.  The  points  were  determined 
by  referring  both  to  the  output  that  the  program  computed  and  to 
the  tables  in  the  speed  computation  program  where  the  performance 
changed significantly. On the vertical axis is a count of the
number of occurrences of a given speed for that vehicle, that
terrain unit, and the range of Monte Carlo trials used. Both
uniform and normal density functions were used to generate the
random numbers used in the Monte Carlo sensitivity trials. Thus
the graph displays the areal sensitivity of the speed predictions
for that vehicle in that area. Notice that the results do not appear
to have a common probability density function. The WES technical
reports by Lessem et al. [6] and [7] contain a more detailed
discussion of the features of the mobility programs which cause the
histograms to assume these shapes.

In general, these histograms will separate into several parts,
each with distinct characteristics. In this particular case the
parts of the graphs associated with each single mode were separated
out. Let us assume this has already been done. We arrange the
results of the Monte Carlo simulations in a histogram of N bins,
with the number of Monte Carlo hits (test items) in the ith bin
equal to hist_i. In order to estimate the number of Monte Carlo
trials necessary to reproduce the probability density function from
which these results give samples, we have to use an unstructured or
nonparametric approach.³ Let us define


(1.1)  $p\left(t + \tfrac{1}{2}J\right) = \dfrac{\sum_{i=t}^{t+J} \mathrm{hist}_i}{N \cdot J}$

where  t = bin number around which the estimate is centered

       J = integer ≥ 1

       N = total number of observations


3 Keinosuke Fukunaga, "Introduction to Statistical Pattern
Recognition," Academic Press, 2nd ed., 1991.

According to the reference by Fukunaga [3], this formula gives
the Parzen density estimate for the value of this probability
density function at the point k = t + J/2.⁴ In this formula we are
using a local region defined by a window of size J around the point
to estimate the number of hits in a counting process in terms of
the histogram values located in this region. This formula gives
estimates for the values of the density function at N-J points.
Sorting these estimates and picking out the middle and highest
values then gives the best prediction of the mean and the mode of
the histogram using windows of size J. On page 261 of this
reference the value of the standard deviation of this estimate is
calculated to be:


t+j 

(1.2)  0(t+l/2  *J)  si3 - 

J+t/J+N 


Note  that  the  value  of  this  standard  deviation  refers  to  an 
interval  around  a  point  on  the  x-axis  of  the  histogram  and  not 
around  the  height  of  the  histogram  or  number  of  Monte  Carlo  values 
in  that  bin. 
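
As an illustration of equation (1.1), the following minimal Python
sketch computes the windowed estimate for every window position and
takes the position with the largest value as the mode candidate. It
assumes a window made of J consecutive unit-width bins starting at
bin t, with bins numbered from zero. The function names and the
sample counts are hypothetical and are not taken from the programs
used for this paper.

```python
def window_density(hist, J):
    """Windowed estimate p(t + J/2) = sum(hist[t:t+J]) / (N * J).

    Assumption: the window holds J consecutive unit-width bins starting
    at bin t, so it is centered at t + J/2 in bin units.
    """
    if J < 1 or J > len(hist):
        raise ValueError("window size J must be between 1 and the number of bins")
    N = sum(hist)  # total number of Monte Carlo hits
    centers, p = [], []
    for t in range(len(hist) - J + 1):
        centers.append(t + J / 2.0)
        p.append(sum(hist[t:t + J]) / (N * J))
    return centers, p


def mode_estimate(hist, J):
    """Return (center, density) of the window with the largest estimate."""
    centers, p = window_density(hist, J)
    best = max(range(len(p)), key=p.__getitem__)
    return centers[best], p[best]


# Illustrative counts only, not data from the figures.
hist = [1, 3, 8, 12, 9, 4, 2, 1]
center, density = mode_estimate(hist, 3)
print(center, density)
```

With J = 1 the estimate reduces to the normalized histogram itself;
larger windows trade resolution for smoothness.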

These formulas and theorems allow a leave-one-out procedure,
along with a maximum likelihood product, to be used to estimate
the value of the window size which gives the smallest error in
estimating the parameters.⁵

Using our procedure for computing estimates of the value of
the probability distribution at the point k defined in equation
1.1, the function p(k) is proportional to the amount the
cumulative distribution function changes in this interval. So
the larger it is, the better is the chance for a local maximum of
the probability distribution function at that point. The program
computes estimates of the continuous modes for different window
sizes, where

J = window size,
x_J = bin number of the largest of these estimates,
p(k) = weighted estimate of the mode at this bin
     = (sum of the number of distribution hits in the bins inside
       a window of width J centered at k) / ((total number of
       shots) * J).

In the case where the distribution function is suspected to be
bimodal, this procedure will identify at least the top two modes
when it is iterated over different window sizes.


4 Actually, this is the density function of a "renewal counting
process" as defined in Parzen [9].

5 See, besides the Numerical Recipes in C reference, also the
Introduction to Statistical Pattern Recognition text referred to
above. These same procedures can be used to characterize the
histogram distribution of pixel intensities in digital images. Such
a characterization allows various neural network learning
procedures to be used to identify the images.


Let δ(J) = the range of data values around the candidate for the
mode calculated using a window size of J.

Thus,  $\delta(J) = \sum_{i = x_J - J/2}^{x_J + J/2} \mathrm{hist}_i$.

Then, in this notation, the probability distribution p_J(k) of
the smoothed estimate of the original data is given by:⁶

$p_J(k) = \dfrac{\delta(J)}{N \cdot J}$

Let

(2.1)  $\delta_J(k) = \sum_{i = k - J/2}^{k + J/2} \mathrm{hist}_i$

Let H(J) = the hypothesis that the true mode x_J has been
identified by considering a window of size J. We want to consider
how likely it is that the range around x_J should be shorter than it
is observed to be. Let P(a,x) be the incomplete gamma function:

$P(a,x) = \dfrac{1}{\Gamma(a)} \int_0^x e^{-t}\, t^{a-1}\, dt$

where:

$\Gamma(a) = \int_0^\infty e^{-t}\, t^{a-1}\, dt$


6 See the discussion in Numerical Recipes in C, edition 1, and
also the book by Parzen, pages 133-134.


Thus, P(a,x) is tied to the cumulative Poisson probability
distribution function: for a Poisson random variable X with mean x
and integer a, the complement 1 - P(a,x) equals Prob(X < a). This
is the probability that the number of Poisson random events
occurring will be between 0 and a - 1. Each of these random
events will have a probability of occurrence of N*p_J.
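
The incomplete gamma function can be evaluated by its power series,
in the manner of the series routine in Numerical Recipes. The
following is a minimal Python sketch under that assumption, not the
code used for this paper; the name poisson_below is illustrative
only. For integer a, the complement 1 - P(a,x) agrees with the
directly summed Poisson probability Prob(X < a) for mean x.

```python
import math


def gammp(a, x, tol=1e-12, max_terms=10000):
    """Regularized lower incomplete gamma function P(a, x), power series.

    Uses gamma(a, x) = exp(-x) * x**a * sum_n x**n / (a (a+1) ... (a+n)).
    The series converges for all x >= 0 but is slow when x is large.
    """
    if a <= 0 or x < 0:
        raise ValueError("requires a > 0 and x >= 0")
    if x == 0:
        return 0.0
    term = 1.0 / a
    total = term
    ap = a
    for _ in range(max_terms):
        ap += 1.0
        term *= x / ap
        total += term
        if abs(term) < abs(total) * tol:
            break
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))


def poisson_below(k, mean):
    """Prob(X < k) for a Poisson variable X, summed term by term."""
    return sum(math.exp(-mean) * mean ** j / math.factorial(j) for j in range(k))


# For integer a the two computations should agree.
print(1.0 - gammp(3, 2.0), poisson_below(3, 2.0))
```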

The probability that the range around x_J is actually shorter than
observed to be if H(n) is true instead of H(J) is:⁷


f  (NPjU)  )  1% ffiijy'i'’  ‘dc 

If we let:


y = N p_J(n) t

a = n
x = δ_J(J)


in the above equation,
then it is equal to:


Pin, 


\ 

NpjinT’ 


which is the same as:


Pin,  J 


TJUX' 


Taking the product of all these factors for each mode x_n then gives
the likelihood that the range around x_n should be shorter than the
range observed around x_J for all n other than J.

Thus the likelihood function is defined by Likelihood(H(J)):


7 Parzen, ibid., pp. 133-134.
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(2.2)  L(n\J)  •n.tof  in,  /*,[!!)) 


The  program  then  computes  the  value  of  this  window  size  J  that 
maximizes  the  likelihood  function,  given  a  set  of  arrayed  a 
posteriori  error  sizes. 


More precisely, the steps in the computation are:

1) Compute the ranges δ_J(k) according to equation (2.1) for the
points corresponding to each bin.

2) Compute the maximum likelihood products according to equation
(2.2) in order to determine the optimal window size.

3) Compute the weighted sums p(k) according to equation (1.1)
and the standard deviations according to equation (1.2) for the
points corresponding to each bin.

Because of the nonparametric form of the Parzen density
estimate, the procedures will work for any empirically determined
histogram. A discrete sorting procedure normally gives a pretty
good estimate of the value of the mean and mode (even assuming the
actual distribution is continuous). However, in order to
approximate the size of the standard deviation in the estimation of
the parameters, it is necessary to use the maximum likelihood
estimators. These estimators of the best window sizes will result
in good approximations of the parameters.

An example of how these parameter estimates work is shown as
applied to the results of the Monte Carlo sensitivity runs in
Figures 1, 2, 3, and 4. The simulations shown in the figures were
conducted for four vehicles: the HMMV, the M977 trailer transporter,
the M113 APC, and the M-1 tank. The top charts show the results for
a mapsheet including Yakima Proving Grounds and the bottom charts
those for a mapsheet including Batchelor, Australia. These figures
show the results of varying the parameter values plus and minus 10
percent around their nominal values. Nominal values are defined as
the vehicle parameters plus the specific parameter values in each
terrain unit. For this analysis, we took the particular values at
which that vehicle experiences a go/no-go situation as the values
around which variations were made.
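
A Monte Carlo sensitivity run of this kind can be sketched as
follows. The speed function toy_speed is a hypothetical stand-in for
the mobility program, with three made-up factors rather than the
nine used in the study, and only the uniform density is shown. Each
trial perturbs every factor by up to plus or minus 10 percent of its
nominal value and adds the predicted speed to a histogram.

```python
import random


def toy_speed(factors):
    """Hypothetical speed model; NOT the mobility program of the paper."""
    soil_strength, slope, roughness = factors
    return max(0.0, 0.5 * soil_strength - 20.0 * slope - 3.0 * roughness)


def sensitivity_histogram(nominal, trials, bins, lo, hi, rel=0.10, seed=0):
    """Histogram of predicted speeds under uniform +/- rel perturbations."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    width = (hi - lo) / bins
    hist = [0] * bins
    for _ in range(trials):
        perturbed = [v * (1.0 + rng.uniform(-rel, rel)) for v in nominal]
        speed = toy_speed(perturbed)
        # Clamp so that out-of-range speeds fall in the end bins.
        idx = min(bins - 1, max(0, int((speed - lo) / width)))
        hist[idx] += 1
    return hist


# Made-up nominal values: soil strength, slope, surface roughness.
hist = sensitivity_histogram([80.0, 0.1, 2.0], trials=1000, bins=20,
                             lo=20.0, hi=45.0)
print(hist)
```

Replacing rng.uniform with a normal draw (for example rng.gauss)
would give the second kind of trial mentioned in the text.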

Data from the M977, M113, and M998 runs were extracted
directly from the top row of histograms in Figures 1, 2, and 3
respectively. Programs were written to expand the information into
a 20-bin histogram and to scale the data. This turned out to be a
good range for the incomplete gamma function to discriminate the
maximum likelihood estimates. The results of the program runs are
shown below. First the program calculates a value for the mode by
simply sorting the columns of the histogram. This is called a
discrete estimate. The abscissa of this point is called modei. Then
the program computes the optimal window size for smoothing the data
using the leave-one-out maximum likelihood procedure explained
above and determines a continuous estimate for the mode along with
a standard deviation. Both of these numbers are computed using this
optimal window size.
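
The discrete estimate described above amounts to sorting the bins by
height and reading off the tallest one. A minimal Python sketch,
assuming equal-width bins whose first bin starts at lo (the function
name and the sample counts are hypothetical):

```python
def discrete_mode(hist, lo, width):
    """Return (abscissa, count) of the tallest bin, found by sorting.

    Assumption: bin i covers [lo + i*width, lo + (i+1)*width) and the
    abscissa reported is the bin midpoint.
    """
    order = sorted(range(len(hist)), key=lambda i: hist[i], reverse=True)
    top = order[0]
    return lo + (top + 0.5) * width, hist[top]


# Illustrative counts only.
print(discrete_mode([2, 5, 9, 4], lo=0.0, width=1.0))
```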

The results are shown below:

histogram of Monte Carlo error runs
M998 Yakima-15 9-factor-terrain ( mode#1 )

[x, p(x) histogram listing]

Data drawn from a histogram of Monte Carlo sensitivity
to errors in terrain factors

Discrete estimate of mode of data set is 42.500000

Discrete estimated value of modei = 5.650001

Probability of mode detected at window size 3 is 0.229365

Probability of mode detected at window size 4 is 0.253296

Probability of mode detected at window size 5 is 0.256476

Probability of mode detected at window size 6 is 0.014268

Probability of mode detected at window size 7 is 0.014372

Most likely window size is 5 value of mode is 32.50000

Continuously estimated value of modei = 5.10000

Standard deviation of the continuous estimate (for this window
size) is 0.607092


histogram of Monte Carlo error runs
M998 Yakima-15 9-factor-terrain ( mode#2 )

[x, p(x) histogram listing]

Data drawn from a histogram of Monte Carlo sensitivity
to errors in terrain factors

Discrete estimate of mode of data set is 11.0000000

Discrete estimated value of modei = 19.549995

Probability of mode detected at window size 3 is 0.204653

Probability of mode detected at window size 4 is 0.039479

Probability of mode detected at window size 5 is 0.116221

Probability of mode detected at window size 6 is 0.065450

Probability of mode detected at window size 7 is 0.129556

Most likely window size is 3 value of mode is 11.0000000
Standard deviation of the continuous estimate (for this window
size) is 0.269430

Continuously estimated value of modei = 11.00000

histogram of Monte Carlo error runs
M977 Yakima-15 9-factor-terrain ( mode#1 )

[x, p(x) histogram listing]

Data drawn from a histogram of Monte Carlo sensitivity
to errors in terrain factors

Discrete estimate of mode of data set is 40.500000
Discrete estimated value of modei = 3.250000

Probability of mode detected at window size 3 is 0.282627

Probability of mode detected at window size 4 is 0.064773

Probability of mode detected at window size 5 is 0.076770

Probability of mode detected at window size 6 is 0.083600

Probability of mode detected at window size 7 is 0.084213

Most likely window size is 3 value of mode is 40.500000
Standard deviation of the continuous estimate (for this window
size) is 1.253331

Continuously estimated value of modei = 3.250000

histogram of Monte Carlo error runs
M977 Yakima-15 9-factor-terrain ( mode#2 )

Data drawn from a histogram of Monte Carlo sensitivity
to errors in terrain factors


Discrete estimate of mode of data set is 6.000000
Discrete estimated value of modei = 8.049999


Probability of mode detected at window size 3
Probability of mode detected at window size 4
Probability of mode detected at window size 5
Probability of mode detected at window size 6
Probability of mode detected at window size 7
Most likely window size is 7 value of mode is

Continuously estimated value of modei = 8.199999

histogram of Monte Carlo error runs
M113 Yakima-15 9-factor-terrain ( mode#1 )

Data drawn from a histogram of Monte Carlo sensitivity
to errors in terrain factors

Discrete estimate of mode of data set is 13.799999

Discrete estimated value of modei = 4.300000
standard  deviation  is  0.283068 

Probability  of  mode  detected  at  window  size  3  is  0.003206 

Probability  of  mode  detected  at  window  size  4  is  0.020749 

Probability  of  mode  detected  at  window  size  5  is  0.114160 




Probability  of  mode  detected  at  window  size  6  is  0.132713 
Probability  of  mode  detected  at  window  size  7  is  0.369124 
Most  likely  window  size  is  7  value  of  mode  is  13.799999 
Standard  deviation  of  the  continuous  estimate  (for  this  window 
size)  is  0.283068 

Continuously estimated value of modei = 13.79999

Summary of Mode Estimates for data

Discrete estimate of mode of data set 1 is point 5.650001 at
42.500000

continuous estimate of mode of data set 1 is point 5.100000 with
value 32.500000

A window of size 5 was used to estimate this

Discrete estimate of mode of data set 2 is point 19.549995 at
11.000000

continuous estimate of mode of data set 2 is point 19.549995 with
value 11.000000

A window of size 3 was used to estimate this

Discrete estimate of mode of data set 3 is point 3.250000 at
40.500000

continuous estimate of mode of data set 3 is point 3.250000 with
value 40.500000

A window of size 3 was used to estimate this

Discrete estimate of mode of data set 4 is point 8.049999 at
6.000000

continuous estimate of mode of data set 4 is point 8.199999 with
value 5.500000

A window of size 7 was used to estimate this

Discrete estimate of mode of data set 5 is point 4.300000 at
13.799999

continuous estimate of mode of data set 5 is point 4.300000 with
value 13.799999

A window of size 7 was used to estimate this

In summary, using this technique of estimation for finding
modes there is in one case (data set 1) about a 10 percent increase
in the accuracy of the determination of its location. This makes
available a more accurate fix on the NOGO program vehicle speed
values around which to do the sensitivity analyses. Also,
determination of the optimal window size to use in the estimate
gives a means to estimate non-parametrically the standard deviation
of the sensitivity analysis results. This then tells us how many
Monte Carlo trials should be used to explore the program's
sensitivity to variations in the values in its internal tables and
input data. For example, for the two runs concerning the M977
performance, one mode has a determination with a standard deviation
of 1.253 and the other with a standard deviation of .1039. After
determining this, you could then go back and run 10 times more
Monte Carlo trials around the first mode. Similarly, although it
was not analyzed for this paper, the determination of a mode in
the case of the M-1 tank is much less well defined. Looking at the
Monte Carlo sensitivity histogram in the top part of Figure 4, it
is clear that in this case the predictions will be less accurate.